ANALYSIS OF A PRESTIGE FRAME OF
REFERENCE BY A GRADIENT 
TECHNIQUE 


C. E. OSGOOD AND ROSS STAGNER
Yale University Dartmouth College 


THE concept of frame of reference has come increasingly
into use in recent discussions of social psychology. The
ingenious study by Sherif (9) has illustrated the fact
that individuals faced with a judgment situation tend to create
for themselves standards upon which judgments can be made.
More frequently, frames of reference may be furnished to the 
individual by his education or his culture. Recent studies on 
color-hearing have shown photistic responses to music to be 
largely determined by cultural frames of reference (6, 7). 
Studies of judgment processes in the fields of art appreciation
(2), political parties (5) and other complex areas indicate that
decisions are commonly based upon standards which may be 
unverbalized and perhaps unconscious (10). 

Studies of social stereotypes (for example, of politico-eco- 
nomic stereotypes (11) such as Capitalist, Communist, Fascist) 
have indicated the existence of such frames of reference but 
have made no attempt to analyze these complex patterns. In 
the present investigation we have attempted to develop a rela- 
tively simple technique for such analysis. 

The specific problem chosen for our purpose was the analysis 
of a frame of reference which may be labeled occupational 
prestige or esteem. Numerous studies (1, 3, 4) have demon- 
strated its general function and it has been shown to exist in 
a large proportion of the population. We do not, however, 
know of any attempts to determine the qualities which make 
up or are associated with this general prestige framework—in 
other words, attempts to examine into the basis of prestige
judgments.

We have thus had two purposes in carrying out this study: 
first, to demonstrate a method for analyzing a frame of refer- 
ence, and second, to investigate the particular determinants 
of the frame of reference known as occupational prestige. 


PROCEDURE 


The technique employed requires the subject to rate a group 
of occupational stereotypes on a series of continua, the ends
of which are defined in terms of the psychological opposites 
of these continua. For example, in Figure 1, a, the stereotype 
SURGEON is followed by a scale calling for a judgment in terms 
of the degree of ‘‘brains’’ or ‘‘brawn’’ thought to be charac- 
teristic of this occupation. While this is similar in appearance 
to the rating scales used in so many studies of personality 
traits, it is significantly different here in that we are not con- 
cerned with any hypothetical characteristics of surgeons which 
might be revealed by this method. We are interested in the 
characteristics of the subject’s judgment-process as he reaches 
a decision on these scales. Since we are trying to work out a 
general approach to the problem of analyzing frames of refer- 
ence, we have elected to call the occupational stereotypes 
‘‘concepts’’ and the psychological continua ‘‘gradients.’’ It 
is obvious, then, that a concept can be any object of social 
judgment, and a gradient any continuum with respect to which 
such objects are perceived as differing. 

The fifteen concepts used in the present experiment (see 
Table 1) are common occupations taken at regular intervals 
from a list of 249 for which the general prestige values were 
determined in a study by C. W. Hall (3). Hall used qualify- 
ing phrases with his occupational stereotypes which we were 
forced to eliminate in order to avoid interference with our 
subjects’ judgments. Nevertheless, the total prestige ranking 
of the fifteen occupations in our study correlates .96 with 
Hall’s data, using the median of his ranks where his list
included more than one example of a certain occupation. Ten
gradients were devised which could be thought of as applying 
in varying degrees to all fifteen occupations. 

It was thought that the subject’s judgments about occupa- 
tions as such might be different from his judgments of the 
persons employed in such jobs. A second form of our test 
blank was therefore prepared with instructions to judge the 
‘‘persons in various occupations’’ as to the listed character- 
istics. Three of the gradients used on the ‘‘job’’ form were 
also employed on the ‘‘person’’ form. The other seven were 
traits more applicable to people than to activities. A complete 
list of the occupations and gradients on the two forms is given 
in Tables 1 and 2. 

In order to evoke absolute judgments as far as possible, the 
ten gradients and fifteen concepts were simultaneously rotated. 
Thus SURGEON occurs once in every fifteen items in the test 
form, each time with a different gradient; and ‘‘kind . . .
cruel’’ occurs once in every ten items, each time with a different
concept. Thus, after a subject has made a judgment on
any of these items, he makes nine other decisions before coming 
to another which is similar in any way. This rotation, to- 
gether with the speed with which this type of test is taken, 
served to keep the subjects from comparing their responses to 
different items. In this respect we believe our procedure to 
be significantly superior to that employed by Asch, Block and 
Hertzman (1) who required their subjects to rank occupations 
according to traits such as intelligence, idealism, conscientiousness,
etc. This ranking forces the subject to form a gradient
of differences when none may previously have existed in his 
mind. 
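
The item order implied by this rotation can be sketched as follows. The article does not give the exact ordering. A plain simultaneous rotation of 15 concepts and 10 gradients would repeat the same pairings after 30 items, so this sketch assumes, hypothetically, that the concept cycle is advanced one extra step after every block of 30 items. The function names are illustrative only.

```python
N_CONCEPTS = 15   # occupations
N_GRADIENTS = 10  # rating continua
BLOCK = 30        # lcm(15, 10): a plain rotation would repeat after this many items


def rotated_items():
    """Return (concept, gradient) index pairs in an assumed rotated order.

    Assumption (not from the article): the concept cycle is shifted by one
    position after every block of 30 items so that all 150 pairings occur.
    """
    items = []
    for i in range(N_CONCEPTS * N_GRADIENTS):
        concept = (i + i // BLOCK) % N_CONCEPTS
        gradient = i % N_GRADIENTS
        items.append((concept, gradient))
    return items


def min_gap(items, position):
    """Smallest distance between items sharing a concept (0) or a gradient (1)."""
    last_seen, best = {}, len(items)
    for i, item in enumerate(items):
        key = item[position]
        if key in last_seen:
            best = min(best, i - last_seen[key])
        last_seen[key] = i
    return best


items = rotated_items()
```

Under this assumed ordering every pairing occurs exactly once, items sharing a gradient are 10 positions apart, and items sharing a concept are at least 14 apart, so nine or more other judgments separate any two related items.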

The instructions to subjects taking the ‘‘person’’ form were 
as follows: 

‘‘Toward the people in certain occupations you may
have a more generally favorable attitude than toward
people in others. Certain personality traits of a person
in a certain job may seem more favorable to you than
others. This test is an attempt to analyze your attitudes
toward the people in various occupations.

‘‘Although we do not ask for your name, we do intend
to use your results, along with the results of other people,
to determine typical attitudes of your social groups toward
the people in these occupations. Therefore we ask you to
mark this test carefully and honestly. You will find that
it goes easily and rapidly.


Directions


1. Please respond to one item at a time, looking neither
forward or backward.

2. Follow the order of items as marked (1, 2, 3, etc.). Do
not skip about in the test.

3. Please indicate your attitudes by marking all items.

4. The capitalized word at the beginning of each item
refers to the person you think of as holding the indicated
position. You are to show your attitudes
toward that person or type of person.

5. The scales identified by small-letter words represent
characteristics or traits that apply in varying degrees
to all people. You are to indicate by a check
mark on the appropriate part of the scale the degree
which best shows your attitude.’’


1 From the subjects’ own reports, we feel confident that little conscious
transfer was made from item to item. In fact, some of them asked if
we had repeated each item several times as a check on reliability.



The instructions were similar on the ‘‘job’’ form except that 
the emphasis was on occupations, not on persons. After 
marking the 150-item test (here the person form), the subject 
was given these instructions: 


‘‘Please RANK the people in the following positions according
to relative PRESTIGE. That is, rank them according
to your idea of the esteem accorded each. Let 1
indicate the person you consider to have the greatest 
prestige; 2 indicate the person with the next highest 
prestige. Give each person a different rank, using num- 
bers from 1-15.’’ 


On the ‘‘job’’ form, of course, the above instructions were to 
rank occupations on the basis of general prestige. Correlation 
between the two average rankings was +.99. In subsequent 
statistical analysis, however, the average rankings for prestige
of occupations were used with gradients on the ‘‘job’’ form and 
those for people in the occupations with gradients appearing 
on the ‘‘person’’ form. 

Note that the general prestige ratings were obtained after 
the marking of the gradients, so that no transfer was possible. 
Further, the instructions at the beginning of the test sought 
to avoid suggesting that general prestige was being studied. 
We believe, therefore, that any prestige frame of reference 
which may have been employed by the subjects was not due to
conscious instruction of any kind.

Our subjects were 100 Dartmouth College men, students in 
introductory psychology. Fifty men filled out the ‘‘job’’ 
form and 50 the ‘‘person’’ form. So far as could be deter- 
mined, each group constituted a random sample of the entire 
class. 

RESULTS 


In treating the results, the first step was to make a frequency
distribution of the 50 judgments for each concept on
each gradient, i.e., for each item. A sample distribution is
shown in Figure 1, b. The distribution for SURGEON is highly
skewed toward the ‘‘brains’’ end of this scale. In Figure 1, c,
the distributions for TAILOR and GARBAGE MAN are superimposed
on that for SURGEON. The wide differences obtained are apparent.
For statistical purposes, the data were plotted in cumulative
frequency curves and the median and Q values were
determined graphically. The medians are summarized in
Tables 1 and 2 for the ‘‘job’’ and ‘‘person’’ forms respectively.


TABLE 3

Correlations of Gradients with Prestige Criterion

JOB form
Hopeful (discouraging) ............................ .99
Noticed (disregarded) ............................. [value illegible]
[label illegible: financial return] ............... .97
*Brains (brawn) ................................... .96
†Exciting (boring) ................................ .96
°Pleasant (unpleasant) ............................ .92
Free (restricted) ................................. .87
Sociable (solitary) ............................... .86
Secure (insecure) ................................. .79
Short hours (long hours) .......................... .20

PERSON form
*Brains (brawn) ................................... .98
Leader (follower) ................................. .92
†Exciting (boring) ................................ .90
Self-assured (indecisive) ......................... .84
Conservative (liberal) ............................ .40
°Pleasant (unpleasant) ............................ .38
Honest (dishonest) ................................ .3[?]
Kind (cruel) ...................................... .28
Idealistic (realistic) ............................ .07
Congenial (cold) .................................. .00

(The symbols *, † and ° above indicate gradients appearing on both forms
of the test. The words in parentheses define the non-prestige end of
the gradients employed.)
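
The medians and Q values were read from cumulative frequency curves. A numerical counterpart of that graphical step is sketched here, assuming that each of the seven scale steps is treated as a unit-wide interval centred on its step number and that Q denotes the semi-interquartile range. The frequencies are invented for illustration and are not the study's data.

```python
def grouped_quantile(counts, p):
    """Interpolate the p-th quantile from frequencies on a 1..k rating scale.

    counts[j] is the number of judgments at scale step j + 1; each step is
    treated as the interval [step - 0.5, step + 0.5] (an assumption).
    """
    total = sum(counts)
    target = p * total
    cumulative = 0
    for j, freq in enumerate(counts):
        if freq > 0 and cumulative + freq >= target:
            lower_edge = (j + 1) - 0.5
            return lower_edge + (target - cumulative) / freq
        cumulative += freq
    raise ValueError("counts must contain at least one judgment")


def median_and_q(counts):
    """Return (median, Q), with Q taken as half the interquartile range."""
    q1 = grouped_quantile(counts, 0.25)
    q3 = grouped_quantile(counts, 0.75)
    return grouped_quantile(counts, 0.50), (q3 - q1) / 2


# Hypothetical distribution of 50 judgments over the seven scale steps.
example_counts = [5, 10, 20, 10, 5, 0, 0]
median, q = median_and_q(example_counts)
```

Linear interpolation within a step corresponds to reading a straight-line segment of the cumulative curve, which is why the two approaches should agree closely when the curve is drawn through the step boundaries.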

The ‘‘general prestige’’ rankings were averaged and trans- 
muted to a seven-point scale so that they would be directly 
comparable to the medians of the gradient distributions. 
Pearson correlations were computed for each gradient with the 
general prestige ranking. These correlations are shown in 
Table 3. 
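
The computation described here can be sketched as follows. The article does not state how the ranks were transmuted to the seven-point scale, so the linear rescaling below is an assumption, and the ranks and medians are invented placeholders rather than values from the tables.

```python
from math import sqrt


def transmute_rank(avg_rank, n_items=15, scale_max=7.0):
    """Map an average rank (1 = highest prestige) linearly onto 1..scale_max.

    Assumed rule: rank 1 maps to scale_max and rank n_items maps to 1.
    """
    return scale_max - (avg_rank - 1) * (scale_max - 1) / (n_items - 1)


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical average prestige ranks and gradient medians for five concepts.
avg_ranks = [1.4, 4.0, 7.5, 11.2, 14.6]
gradient_medians = [6.5, 5.4, 4.1, 2.9, 1.6]

prestige_scale = [transmute_rank(rank) for rank in avg_ranks]
r = pearson_r(prestige_scale, gradient_medians)
```

Because the product-moment coefficient is unchanged by any linear rescaling, the choice of transmutation rule affects comparability of the values but not the size of the correlation, provided the rule is linear.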

The correlations obtained for the ‘‘job’’ form are amazingly
high. Under the conditions of this study, students rating an 
occupation in terms of hopefulness, being noticed, financial 
return, brains, excitingness and pleasantness react to it in a 
fashion almost identical with their judgments of its general 
prestige. Even freedom, sociability and security give correla- 
tions above .79, while only ‘‘hours of work’’ shows no signifi- 
cant relation to prestige. 

For the person form, very noticeable differences appear. 
Only three of the correlations are above .90, and six are .40 
or less. Brains, leadership, excitingness and self-assuredness 
give high correlations with the general prestige of persons. 
Conservatism, pleasantness and honesty yield correlations be- 
tween .30 and .40, while kindness, idealism and congeniality 
approach zero relationships with the general prestige values. 

Of the three gradients duplicated in the two forms, ‘‘brains’’ 
and ‘‘excitingness’’ give similar results on both forms, but 
‘‘pleasantness’’ is relatively independent of prestige for per-
sons. It seems clear, therefore, that our instructions were 
effective and that the subjects reacted differently to the two 
forms, calling for judgments of occupations and of persons in 
occupations. 

It will be noted that, on the person form, there is a sharp 
dichotomy between what might be called dominance traits 
and humanitarian traits. Qualities like brains, leadership and
self-assurance are shown as marks of prestige. Such charac- 
teristics as honesty, kindness, idealism and congeniality defi- 
nitely are not. We do not think it far-fetched to conclude 
from this that our students sense the fact that the so-called 
‘‘Christian virtues’’ do not make for success in our modern
world, at least in the ordinary course of events. Traits of 
personal aggressiveness, on the other hand, they consider to be 
effective determiners of success. 

Although such humane qualities as honesty, kindness and 
congeniality are not related to prestige in the minds of our 
subjects, as shown above, they are indicated as favorable 
values. It will be seen by referring to the averages for the 
various gradients on the ‘‘person’’ form (Table 2) that averages
for these humane qualities are quite high. In other
words, the subjects considered all occupational persons quite 
honest and kind. Likewise all were considered quite con- 
genial, self-assured, and pleasant. On the other hand, all the 
persons on the test were thought to be low in idealism (i.e., very
realistic). These facts strongly suggest that a projection 
mechanism is at work. Our subjects’ own personal values are 
projected on the persons they are judging. 

A consideration of both the correlational data and the 
medians given in Tables 1 and 2 suggests the presence of an 
extremely rigid frame of reference. The correlations are un- 
usually high, even for medians of 50 judgments, and convey 
an impression that, even before the students made their gen- 
eral prestige rankings, a prestige frame of reference was 
determining their decisions. This impression is strengthened 
by a consideration of the means of all ten gradients for the 
fifteen occupations. As seen in Table 1, there is a consistent
decrease in the mean rating as we move from the highest 
(SURGEON) to the lowest (GARBAGE MAN), the only exceptions 
being STOREKEEPER, POSTMAN and CARPENTER which are practi- 
cally identical. The same tendency, to a lesser extent, is seen 
in the ‘‘person’’ form (Table 2). The subjects were prone to 
rate esteemed occupations high in all characteristics. 

The method outlined here is useful not only for revealing
the stereotyped character of thinking with regard to a class of 
objects, but also as a form of qualitative analysis of specific 
concepts. It is possible to correct for the fact that all the 
ratings are skewed in certain respects (e.g., all occupations are
judged as above average in honesty) and that all ratings for 
certain occupations are skewed (e.g., all ratings but one for 
SURGEON are above average), thus reducing all the judgments 
to a common base line. When this is done, qualitative vari- 
ations in stereotypes appear. Figure 2, for example, shows 
judgments of BOND SALESMAN and MUSICIAN on the person form. 
The MUSICIAN is judged high (plus values in figure) in idealism,
kindness and honesty, while the BOND SALESMAN is relatively
low in these. On the other hand, the BOND SALESMAN is
thought to show traits of self-assurance, congeniality and con- 
servatism much more than the MUSICIAN. But on the three 
gradients which seem to determine prestige, viz., brains, leader- 
ship and excitingness, they are very similar. 
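
The baseline correction is not spelled out in the article. One way to realise it, offered here only as an assumed reading, is to remove both the concept (row) mean and the gradient (column) mean from each median and add back the grand mean, so that what remains is the deviation peculiar to one concept on one gradient. The small matrix is invented for illustration.

```python
def common_baseline(medians):
    """Double-centre a concepts-by-gradients table of median ratings.

    Assumed correction: value - row mean - column mean + grand mean.
    """
    n_rows, n_cols = len(medians), len(medians[0])
    row_means = [sum(row) / n_cols for row in medians]
    col_means = [sum(row[j] for row in medians) / n_rows for j in range(n_cols)]
    grand = sum(row_means) / n_rows
    return [
        [medians[i][j] - row_means[i] - col_means[j] + grand for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Hypothetical medians: rows are concepts, columns are gradients.
example = [
    [6.0, 5.5, 3.0],
    [5.0, 3.5, 5.5],
    [2.5, 3.0, 2.0],
]
profile = common_baseline(example)
```

After this adjustment every row and every column of the table averages zero, so plus and minus values can be compared across concepts without the general prestige level or the general favorableness of a gradient intruding.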

Another comparison of individual stereotypes may be used
to emphasize this point. In Figure 3 we compare the BUSINESS 
MAN and STOREKEEPER on the person form. While the two
profiles are generally similar, the business man is definitely 
higher on the traits of dominance (brains, leadership and
self-assurance); when we shift to such humanitarian qualities as
pleasantness, honesty and kindness, the storekeeper is rated 
more favorably. Such comparisons could be made with any set 
of concepts to show the sensitivity of this technique in picking 
out the outstanding features of our stereotyped ideas about 
social reality. 

The measures of variability (Q) have not been found to 
differ in any consistent fashion from concept to concept or 
from gradient to gradient, when the complete set of data is 
considered. These measures have consequently been omitted 
from this article. 


DISCUSSION


The notion of ‘‘frame of reference’’ has been of value in 
social psychology because it calls attention to the fact that 
social reality is perceived always in relation to some set of 
standards. Our data have shown that decisions about charac- 
teristics of occupational stereotypes tend to conform closely to 
a framework which is based on the relative prestige of occu- 
pations. The subjects checked the gradients under conditions 
and at a speed which made painstaking judgments impossible. 
We feel justified in inferring, therefore, that the mere presentation
of a group of concepts to be judged in terms of specific
qualities automatically sets up a frame of reference. The
judgment on each scale, then, is unconsciously determined by 
its relationship to the prestige framework. Asch, Block and 
Hertzman (1) have reached a similar conclusion: ‘‘It is our 
belief that the observer, in the absence of objective criteria, 
and in the face of the necessity of reaching some conclusion, 
proceeds to arrange a scale of preference in terms of some 
generally favorable or unfavorable impression. Further, that 
the same general impression functions similarly in fixing judg- 
ments of more specific characteristics.’’ Since our method, 
unlike theirs, does not force the formation of a gradient, the 
spontaneous character of this process seems convincingly 
proven. The complete control of the general framework over 
specific judgments is attested by the unusual size of our 
obtained correlations. 

Our method makes it possible to determine the significant 
elements in any stereotype, and to eliminate characteristics 
which do not contribute to a judgment relative to the given 
frame of reference. It is apparent, for example, that (in the 
minds of our subjects) ‘‘short hours’’ are not a sign of prestige.
Similarly, prestige is not associated with personal qualities 
such as congeniality, kindness, honesty or their opposites. 

This technique might be extended to differentiate between 
frames of reference, as shown by changes in response. The 
fact, for example, that pleasantness correlates .92 with job 
prestige and only .38 with personal prestige may indicate that
the word ‘‘pleasant’’ has a different meaning in relation to 
persons from its meaning in relation to activity; and that this
difference is proof of a change in the frame of reference 
involved. 

The fact that we can obtain quantitative measures of the 
degree of association between gradients of judgment in this 
manner, and that such measures are high enough to satisfy the 
most exacting statistical requirements, suggests that the tech- 
nique here employed may be useful in the analysis of a wide 
range of semantic problems. Certainly our method is useful 
in the study of all kinds of socio-economic stereotypes.²

The reliability of our data cannot be determined by the 
usual split-half technique, since we are dealing with each item 
rather than with a total score.³ We have, however, two gradients
which appear in both forms (brains and excitingness) and
which were apparently not much affected by the shift from 
‘‘person’’ to ‘‘job’’ instructions. The thirty medians on these
two gradients correlate .94 for the two groups of 50 men. The 
size of our obtained correlations tends to confirm this as a 
good estimate of reliability. Statisticians seem generally 
agreed that an upper limit to the correlation of two variables 
is set by the square root of the product of the two reliability 
coefficients. The reliability of the prestige ranking is .99, but 
even so, to obtain the coefficients listed in Table 3, the gradient
medians must also have reliabilities high in the nineties.
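
The bound referred to here says that an observed correlation cannot exceed the square root of the product of the two reliabilities. A short sketch of the arithmetic, using the .99 figures quoted in the text; the function names are illustrative.

```python
from math import sqrt


def max_correlation(rel_x, rel_y):
    """Upper limit on an observed correlation given two reliabilities."""
    return sqrt(rel_x * rel_y)


def min_reliability_needed(observed_r, rel_other):
    """Smallest reliability of one variable consistent with observed_r."""
    return observed_r ** 2 / rel_other


# Prestige-ranking reliability (.99) and the highest coefficient reported (.99).
needed = min_reliability_needed(0.99, 0.99)
```

With both values at .99 the required reliability of the gradient medians is itself about .99, which is the sense in which they must be "high in the nineties."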

We believe, therefore, that the technique described in this
article is valuable for the analysis of verbal frames of reference.
Students mark this type of test with extreme rapidity;
most of them finished our 150-item test in 15 minutes or less.
It yields high reliabilities, is very sensitive, and is adaptable
to a great variety of problems, which may be reduced to relationships
between a set of concepts and a set of gradients.

2 The authors are now engaged in a study of such concepts as war,
patriotism, pacifism, nationalities and other related terms. Data on these
have been collected from several hundred adults on two or more occasions
and the shifts in meaning are being considered in relation to the events
of the European War.

3 The suggestion that half the subjects be correlated against the other
half, made by the Journal Editor, came after the original data sheets
had been discarded.


SUMMARY 


1. An analysis of occupational stereotypes was made by 
pairing in rotation each of a list of 15 names of occupations 
with each of a set of ten characteristics in which these occu- 
pations might be thought of as differing. A second test form 
used the same occupations, but instructed the subjects to judge 
persons rather than jobs. 

2. General rankings for prestige, made after the test blank 
had been marked, correlate as highly as .99 with median judg- 
ments on the gradient test. 

3. Our subjects reacted in a significantly different fashion 
on the ‘‘person’’ and ‘‘job’’ forms.

4. Prestige is imputed to occupations per se on the basis of 
such characteristics as hopefulness, being noticed, financial 
return, brains, excitingness and pleasantness. Prestige is 
imputed to men in specified jobs on the basis of brains, leader- 
ship, excitingness and self-assuredness. 

5. The conditions of the experiment exclude the possibility 
of conscious verbalization of a prestige frame of reference. 
We therefore conclude that the mere presentation of a set of 
occupational stereotypes for a series of judgments caused our 
subjects spontaneously to establish a prestige framework which 
then determined in a highly reliable manner judgments on the 
specific traits listed. 

6. The technique is practical and adaptable to a wide variety 
of investigations in the analysis of frames of reference influ- 
encing our judgments of social reality. 


THE SELECTION OF DEPARTMENT STORE
PACKERS AND WRAPPERS WITH THE 
AID OF CERTAIN PSYCHO- 
LOGICAL TESTS: 


STUDY II


MILTON L. BLUM 
College of the City of New York and Vocational Service for Juniors 
AND 
BEATRICE CANDEE 
Vocational Service for Juniors 


THE purpose of this study was to determine whether
certain psychological tests could be used to predict the 
successful wrappers or packers in a department store. 

In a previous study (1), the authors concluded that experi- 
ence appeared to be of greater importance in the determination 
of success in these jobs than scores on the dexterity tests used. 
A zero correlation was obtained between the seasonal em- 
ployee’s production records and score on the finger dexterity 
test. A higher correlation (+.35) was obtained between pro- 
duction records and the Minnesota Placing test. In neither 
case was a serviceable critical score found for selecting workers. 

Since these findings on the finger dexterity in particular are 
at variance with results found by another store, it was decided 
to check them by conducting a similar study in still a different 
department store. Arrangements were made to give a battery 
consisting of the O’Connor Finger Dexterity, the Zeigler 
Placing, and in addition the Otis Self-Administering and the 
Minnesota Clerical to a group of applicants for seasonal work
as packers or wrappers. Production records and foremen’s
ratings were the criteria available against which test performance
was to be checked. A smaller group of permanent
employees at the store was given the same battery of tests and
similar criteria were made available. Approximately 50 per
cent of the employees volunteered to take the tests after a
lengthy explanation that the purpose of the study was to test
the tests rather than test the employees.

1 Study I appeared in the JOURNAL OF APPLIED PSYCHOLOGY, Vol.
XXV, No. 1, February, 1940.

The seasonals were referred as having had the test battery 
but no test results were given to the personnel authorities of 
the store. The results of the testing of the employees were not 
made available to the store authorities. In all, we tested 317 
seasonal employees and 55 permanent employees. 





Results 


Two closely related jobs, packing and wrapping, were in- 
vestigated. The packer does large bulky packaging, while the 
wrapper does the smaller units and also attends to the cashier- 
ing. The average test score for each job, sub-divided according 
to sex, was obtained and is presented in Table 1. 

A comparison of the average test scores between packers and
wrappers indicates that only on the Otis test in the seasonal
group do really reliable differences exist. The wrappers are
superior to the packers on this test. The D/σ diff.
between the average scores of male seasonal packers and male
seasonal wrappers is 4. In the female comparison, the D/σ diff.
is 3.3. The permanently employed wrappers are superior to the
permanently employed packers by about the same margin as in
the seasonal comparison, but because of the
limited number of subjects the D/σ diff. is 1.4 for the males
and 1.5 for the females. Apparently the interviewers choose
the more mentally alert person for the job of wrapper and the
less alert person for the job of packer. The placing test yields
results which can be considered suggestive and along the
same line. All four comparisons indicate the trend for the
packers to be more capable on this test than the wrappers, even
though no one comparison yields statistically reliable differences.
The packers as a group tend to be lower on the mental
test and higher in dexterity than the wrappers. A D/σ diff.
of 1.3 between the average scores of male seasonal packers and
wrappers is obtained. In the female comparison the D/σ diff.
is 2.1. Between male permanent packers and male permanent
wrappers the D/σ diff. is .8 and in the comparison of the two
female groups, the D/σ diff. is 5.2. Results based upon the
Minnesota Clerical and the Finger Dexterity test are equivocal.
Whereas the male and female wrappers tend to show a slight
superiority over male and female packers in the seasonal
groups, this difference disappears among the permanent employees.
The D/σ diff. values for the finger dexterity comparisons
are negligible. The wrappers are slightly superior in the seasonal
group, but the packers are better in the permanent group.
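
The D/σ diff. index used throughout is the difference between two group means divided by the standard error of that difference. A minimal sketch, assuming independent groups and the usual large-sample standard error; the means, standard deviations and group sizes are invented, since Table 1 is not reproduced here.

```python
from math import sqrt


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means divided by its standard error."""
    se_diff = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


# Hypothetical test-score summaries for two large groups.
ratio = critical_ratio(mean_a=42.0, sd_a=10.0, n_a=100,
                       mean_b=38.0, sd_b=10.0, n_b=100)

# The same raw difference with much smaller groups.
small_ratio = critical_ratio(mean_a=42.0, sd_a=10.0, n_a=25,
                             mean_b=38.0, sd_b=10.0, n_b=25)
```

Because the standard error grows as group size shrinks, the same raw difference yields a smaller ratio in small groups, which is the pattern reported for the permanent employees.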

The data presented in Table 1 also afford the opportunity to 
compare the test results of males and females performing simi- 
lar jobs. Among the seasonal employees, the male packers and 
wrappers obtain better scores on the Otis test than the girls. 
The D/σ diff. values between the averages are 2.4 and 3.7, respectively.
The girls are better on the finger dexterity than the
boys, with D/σ diff. between the average scores of packers and
wrappers of 3.2 and 4.6 respectively. With the permanent
workers, the actual test differences tend to be just as large but
because of the smaller number of subjects, these differences are
not found to be statistically reliable. Possibly, the males must
accept these jobs, whereas the females who possess equal mental
alertness find employment in offices, etc. This would explain
the sex difference in intelligence as here found. In our gui- 
dance division, as elsewhere, girls have been found to exceed 
boys on the Finger Dexterity and Minnesota Clerical tests. 
The results reported in this study are similar. 

Naturally, the crux of the problem of evaluating tests for 
selection is to compare test performance and criteria. As 
already mentioned, two criteria, supervisor’s ratings and pro- 
duction records, were available. Each employee is rated by 
his immediate supervisor according to the following scale: 
A is excellent; B+ is very good; B is good; B- is fair; C is
slow. These ratings were already in use by the store. Al- 
though the seasonal employees were rated according to this 
five-point scale, all permanent employees were rated as ‘‘A’’ 
or ‘‘B’’. This finding is similar to the one in our previous 
study (1). Apparently, foremen are only willing to undertake 
to distinguish between exceptional and good and rarely rate 
permanent employees lower. As noted in the previous article, 
the rating affects the probability of lay-off after the holidays 
and permanent employees are seldom rated in such a way as 
to jeopardize their jobs. In Table 2, the average scores on each 
test are presented according to the various rating categories. 

The differences between the averages on all tests of the male 
permanent wrappers and packers rated ‘‘A’’ and those rated 
‘‘B’’ are not statistically reliable. The same is true for the
females rated ‘‘A’’ or ‘‘B’’. In other words, these tests do 
not differentiate the exceptional from the good permanent 
employees as determined by supervisors’ ratings. A compari- 
son of average test scores of ‘‘A’’ male and female permanent 
employees indicates that no statistically reliable differences 
exist. 

Among the seasonal employees, males rated ‘‘A’’ tend to 
be better than those rated ‘‘C’’ on all tests. However, only 
on the Minnesota Clerical does a difference between the means 
approaching statistical reliability exist. The D/σ diff. between
the means on the number checking test is 3.4; on the
name checking test this difference is 2.1. The females em- 
ployed as seasonals who are rated ‘‘A’’ or ‘‘C’’ are not dif- 
ferentiated on the basis of test performance. It is possible that 
factors which govern a supervisor’s rating of a male employee 
are not quite the same as those factors which govern a female 
rating. 

Although females usually obtain better scores than males 
on both parts of the Minnesota Clerical test, it was found that 
the males who receive ‘‘A’’ ratings tend to be better on these 
tests than the females with similar ratings. The D/σ diff. values
between the means are .7 on the numbers test and 2.3 on the
names test. Even though the critical ratios are not reliable, 
the reversal in the expected direction might imply that male 
packers and wrappers are more likely to be successful if they 
are superior in these two measures, especially the Minnesota 
number checking test. Both males and females who obtained 
‘‘A’’ ratings tended to be superior to the ‘‘C’’ males and
females on the finger dexterity and placing tests, but these
differences are not statistically reliable as the D/σ diff. in each
of the four comparisons is less than 2. The Otis test was not 
found to be a factor in discriminating ‘‘A’’ from ‘‘C’’ 
workers. 
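
The critical ratios quoted above divide the difference between two group means by the standard error of that difference. The following is a minimal sketch of that computation, assuming independent groups and the usual standard error of a difference between means; the function name and the sample figures are hypothetical and are not taken from the study.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return D / sigma_diff for the means of two independent groups."""
    # Standard error of each mean.
    se_a = sd_a / math.sqrt(n_a)
    se_b = sd_b / math.sqrt(n_b)
    # Standard error of the difference between the two means.
    sigma_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_a - mean_b) / sigma_diff


# Hypothetical test scores for two rating groups (illustrative only).
ratio = critical_ratio(120.0, 20.0, 25, 100.0, 20.0, 25)
```

A ratio computed in this way can be set beside the values reported in the text, such as 3.4 and 2.1 for the two parts of the Minnesota Clerical.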

The second criterion available was the production record. 
It was found that the average number of packages wrapped 
by males and females was similar in both jobs. However, a 
statistically reliable difference was found to exist between the 
averages when the two jobs were compared. This is to be 
expected, since the wrappers also do cashiering. Correlations 
between productions and test results were therefore obtained 
for each job combining males and females. 

For the seasonal employees, all correlations were low and 
unreliable with the single exception of the correlation between 
production of packers and the placing test. Here the r is 
+.37 ± .09. Establishment of a critical score at 234” or better 
was found to exclude 15 employees and only 1 of these had an 
average production record of over 95 units. Of the remaining 
65 employees with test scores better than 234”, 47 per cent 
achieved production records of over 95 units. 
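
The effect of a critical score of this kind can be tabulated directly. The sketch below assumes that the placing test is scored as a time, so that a lower score is better, and that a score exactly at the critical value passes; both points are assumptions, and the records shown are invented for illustration.

```python
def evaluate_cutoff(records, cutoff, target):
    """Summarize production on each side of a critical test score.

    records: list of (test_time, production_units) pairs.
    A time at or below the cutoff passes (assumed: lower is better).
    """
    passed = [prod for time, prod in records if time <= cutoff]
    excluded = [prod for time, prod in records if time > cutoff]

    def share_over(group):
        # Proportion of the group whose production exceeds the target.
        return sum(1 for p in group if p > target) / len(group) if group else 0.0

    return {
        "n_passed": len(passed),
        "n_excluded": len(excluded),
        "passed_over_target": share_over(passed),
        "excluded_over_target": share_over(excluded),
    }


# Invented records: (placing time, average production in units).
sample = [(210, 101), (225, 97), (230, 90), (240, 88), (250, 96), (260, 80)]
summary = evaluate_cutoff(sample, cutoff=234, target=95)
```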

The correlations between test scores and production records 
for the permanent employees tend to be higher than those for 
the seasonal employees. However, all but two are unreliable, 
as the sigmas are more than one-third the correlations. The 
two exceptions occur on the Minnesota Numbers and Packers’ 
production (+.57 ± .12) and the Minnesota Names and Wrap- 
pers’ production (+.65 ± .14). 
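
The rule of thumb stated above, that a correlation is unreliable when its sigma exceeds one-third of the coefficient, can be written as a short check. This is a sketch only: the function name is hypothetical, the three coefficients are those quoted in the text, and the final pair is an invented contrast case.

```python
def is_reliable(r, sigma, multiple=3.0):
    """True when the coefficient is at least `multiple` times its sigma."""
    return abs(r) >= multiple * sigma


# Coefficients quoted in the text, as (r, sigma) pairs.
reported = {
    "placing test and packers' production (seasonal)": (0.37, 0.09),
    "Minnesota Numbers and packers' production": (0.57, 0.12),
    "Minnesota Names and wrappers' production": (0.65, 0.14),
}
verdicts = {name: is_reliable(r, s) for name, (r, s) in reported.items()}

# Invented contrast case: the sigma is more than one-third of r.
contrast = is_reliable(0.30, 0.12)
```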

It is difficult to comment on the meaning of the correlations 
reported. With inexperienced workers employed as packers, 
a low but reliable relation is found between the placing test 
and production. This disappears among the experienced 
group. The finger dexterity shows no relation to production. 
Both of these results are in agreement with our earlier study 
(1) in a different store. Neither dexterity test shows any rela- 
tion to the wrappers’ job. The Otis test is unrelated to pro- 
duction records in either job. The Minnesota Clerical shows 
positive relations to both jobs in the experienced group. Ap- 
parently something represented by this test is valuable for 
good work on the job over a period of time. The decrease in 
differences in dexterity with experience might help to empha- 
size the differences in the clerical aspects of the jobs. 


SUMMARY 


1. Wrappers were found to be more mentally alert than 
packers. Apparently the interviewers are quite capable of 
selecting the more mentally alert persons. 

2. The packers were found to be faster on the placing test 
than the wrappers. 

3. Equivocal results were obtained when packers or wrap- 
pers were compared on performance on the Finger Dexterity 
and Minnesota Clerical tests. 

4. The males were found to obtain higher scores on the Otis 
test but possessed less finger dexterity ability than the females. 

5. The permanently employed rated by supervisors as ‘‘A’’ 
are not differentiated on the basis of the tests used in this 
study from those rated as ‘‘B’’. There are no ratings below 
this among the permanently employed. 

6. The male seasonal employees rated ‘‘A’’ tend to be 
slightly superior to those rated ‘‘C’’ on all test performances. 
The female seasonal employees rated ‘‘A’’ are slightly superior 
on three of the five measures obtained. 

7. All correlations between scores and production records 
for the seasonal employees were unreliable with one exception. 
The r between production and the Placing test for Packers 
was +.37 ± .09. A favorable critical score of 234” was 
established. 

8. Among the permanent employees all correlations between 
test scores and production records are unreliable except two. 
The r between the Minnesota Numbers and Packers’ produc- 
tion is +.57 ± .12. The r between Minnesota Names and 
Wrappers’ production is +.65 ± .14. 


CONCLUSIONS 


1. Manual dexterity as measured by the two tests used here 
is not a selective factor in department store wrappers. With 
packers, relatively gross movement has some discriminating 
value in initial adjustment to the job, but this tends to disap- 
pear with experience. 

2. Clerical speed and accuracy seem to have a much higher 
relationship to production in the long run. Apparently initial 
adjustment to the job is influenced in packers to some extent 
by speed of relatively gross movement, but long term superi- 
ority on the job depends more on clerical ability with both 
wrappers and packers. 

















MECHANICAL ABILITY AS A FACTOR IN 
ENGINEERING APTITUDE¹ 


EDWARD N. BRUSH 
University of Maine 


THE purpose of the investigation which is reported here 
was to explore the possibilities of available tests of me- 
chanical ability and aptitude as indicators of aptitude 
for engineering. Such tests might well be considered partial 
measures of aptitude for engineering if they could be shown 
to bear a substantial relationship to academic success in a 
college of engineering. While validation of tests against ulti- 
mate professional success is much to be desired, such validation 
is beyond the scope of the present investigation. In fact, 
Kandel (14), in a recent study of professional aptitude tests, 
says, ‘‘No one has claimed, and it would be foolish to claim, 
that any aptitude test can predict ultimate professional suc- 
cess; all that it can be expected to predict is success in a well- 
defined course of professional preparation.’’ 

O’Connor (21) has stressed the importance of mechanical 
aptitude or ‘‘structural visualization’’ for success in engineer- 
ing training, and Bingham (3) regards the ability ‘‘to per- 
ceive the sizes, shapes and relations of objects in space and to 
think quickly and clearly about these relations’’ as a distinct 
asset to the student of engineering. Tests of mechanical 
ability² have been given in a few colleges of engineering (19) 
but there have been few reports in the literature on the results 
of such tests. 


1 This study was made possible by a grant from the Coe Research 
Fund Committee of the University of Maine. The writer also wishes to 
acknowledge the cooperation of Dean Paul Cloke of the College of Tech- 
nology of the University of Maine. Several N. Y. A. student workers, 
notably Miss Josephine Freeman, have assisted in the computations. Dr. 
Lillian Hatfield Brush assisted in the preparation of the manuscript. 

2 Because of the empirical nature of this study the writer does not wish 
to raise the issue here as to the difference between mechanical ‘‘apti- 
tude’’ and mechanical ‘‘ability.’’ Hence the terms will be used inter- 
changeably. 


SURVEY OF THE LITERATURE 


Many of the published studies on engineering aptitude deal 
with the use of intelligence, scholastic aptitude, placement or 
achievement tests, and high school grades, used singly and in 
combination, to predict scholastic achievement in colleges of 
engineering (e.g., 1, 6, 7, 15). Feder and Adler (6) report 
coefficients of multiple correlation as high as .77 between com- 
binations of such tests and first semester grades, whereas ‘‘in 
general, prediction coefficients in education range between .40 
and .60.’’ Wagner (31) reports a median R of .67 between 
college success and combinations of prognostic measures in a 
survey of the literature on college performance prediction. 
These values represent a considerable improvement over the 
coefficients of correlation of from .30 to .45 obtained with the 
Thurstone Vocational Guidance Tests for Engineers in the 
early 1920’s (19). 

High school grades have been found by several investigators 
to yield better prediction of college achievement than test 
results (8, 29), but there are serious difficulties in dealing with 
data of this kind where the students come from many high 
schools. As Jones and Brown (13) point out ‘‘the conditions 
in different schools and localities are so different that any pre- 
dictions based on facts from school records will vary greatly 
from place to place.’’ In some of the studies of engineering 
aptitude, grades in particular subjects are considered in their 
relation to college success. Dvorak and Sayler (5) report the 
following r’s with university freshman average: high school 
mathematics, .488; high school natural science, .457; the Uni- 
versity of Washington Intelligence Test, .374; Iowa Mathe- 
matics Aptitude and Training Tests, .577; Iowa Physics Apti- 
tude and Training Tests, .546. Laycock and Hutcheon (16) 
found an r of .61 between average Grade XII marks and aver- 
age first year engineering marks, whereas the r between the 
latter and the American Council Psychological Examination 
was .34. 

Some attention has been given to quality of work in fresh- 
man engineering courses as an indicator of success in advanced 
work. Wilson and Hodges (32) report the following r’s of 
freshman grades with grades in advanced engineering courses: 
all freshman engineering grades, .584; mathematics, .630; 
mechanical drawing, .453; chemistry, .356. The multiple R, 
based on all of these and the Otis Advanced Intelligence Scale, 
was .690. Ayer (2) found that 91.7 per cent of those who 
graduated failed neither physics nor mathematics, whereas 
only 10.6 per cent of those who graduated failed either or 
both of these subjects. Higgins (9) reports a close relation- 
ship between work in first year mathematics and average grade 
for the entire four years. 

There have been several studies dealing with the prediction 
of grades in particular courses in the engineering curriculum. 
According to Paterson et al. (23), Mann has found an r of .63 
between his Staticube Test of Power to Visualize and grades 
in descriptive geometry, while his Dynamicube and Mutilated 
Cubes tests yield somewhat lower r’s with descriptive geometry 
grades. McCauley (17) obtained r’s ranging from .58 to .71 
between his Tetrahedron Test of Power to Visualize and 
grades in descriptive geometry. Stuit (29) found the Iowa 
Physics Test and the Mann Mutilated Cubes Test to be the 
best of a number of measures tried for predicting success in 
college physics. Horton (11) has studied the MacQuarrie 
Test for Mechanical Ability, the Army Beta Test and the Iowa 
Mathematics Aptitude Examination in their relations to suc- 
cess in freshman engineering courses. The highest r’s re- 
ported are those between Army Beta and MacQuarrie total 
scores and grades in engineering drawing (.43 and .44, respec- 
tively). Correlations between sub-test scores and engineering 
drawing grades range from .13 to .40. Predictive combina- 
tions, including high school grades as well as test scores, yield 
multiple coefficients of .58 for chemistry, .55 for engineering 
drawing, .52 for descriptive geometry and .42 for mathe- 
matics. Harris (8) cites a number of studies reporting r’s of 
from .41 to .66 between various scholastic aptitude, placement 
and intelligence tests, and grades in physics, chemistry and 
mathematics. 

Similar to the present investigation, in that they have in- 
cluded tests of mechanical ability among their proposed 
measures of engineering aptitude, are the researches of Hol- 
comb and Laslett (10) and of Laycock and Hutcheon (16). 
Holcomb and Laslett report r’s with college grades in a school 
of engineering as follows: MacQuarrie Test for Mechanical 
Ability, .478; Stenquist Mechanical Aptitude Test No. 1, .146; 
Stenquist No. 2, .428; Stenquist Assembly Test, .164; Strong 
Vocational Interest Blank (Engineer Scale), .322. No coeffi- 
cients of multiple correlation are given. Laycock and 
Hutcheon find the following r’s with first year marks: the 
Form Relations Test of the N. I. I. P., .25; Cox Mechanical 
Aptitude Test M2 (Models), .16; Cox Mechanical Aptitude 
Test D (Diagrams), .14; Physical Science Interest Score on 
the Thurstone Interest Inventory, .26. A battery including 
Grade XII marks, American Council Psychological Examina- 
tion, Physical Science Interest and Form Relations yielded a 
coefficient of multiple correlation of .66. 

Except for the instances cited above, the tests of mechanical 
ability used in this investigation have, in general, been vali- 
dated against other criteria than engineering college grades. 
Remmers and Smith (25) report an r of —.30 with descriptive 
geometry for O’Connor’s Wiggly Block Test. The Minnesota 
Paper Form Board is reported to yield an r of .33 with me- 
chanical drawing grades, and the revised form of this test has 
correlated to the extent of .49 and .32 with mechanical draw- 
ing and descriptive geometry grades (24). 


PLAN OF THE INVESTIGATION 


In the present study the plan was to administer a number 
of tests of mechanical ability to first year students in the Col- 
lege of Technology at the University of Maine, and to deter- 
mine the prognostic value of each test as well as that of com- 
binations of tests. The criterion was scholastic rank (point 
average) in courses of an engineering nature. Grades in 
English, public speaking and other courses in the humanities 
and social sciences were excluded in computing the averages. 
Whereas most published studies on engineering aptitude have 
used first semester or first year grades as criteria, in this study 
the average grades for the individual student’s whole aca- 
demic career in the College of Technology were also used.³ 

Two groups of students served as subjects. Group A in- 
cludes 104 members of the Class of 1933. Group B is com- 
posed of members of the Class of 1935. In the latter group the 
total population was 160, but since it was not possible for all 
students to take all of the tests the r’s were computed on the 
basis of all available cases. The N’s range from 77 to 160, 
with a median of 127. No known selective factors were opera- 
tive in determining which students missed particular tests. 
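
Computing each r on all available cases means that every coefficient rests on those students who have both a test score and a criterion grade, so that N differs from one coefficient to another. The following is a minimal sketch of that procedure using the ordinary product-moment formula; the function name and the sample values are hypothetical, and missing scores are represented by None.

```python
import math


def pearson_available_cases(xs, ys):
    """Return (r, n) using only the pairs in which both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("at least two complete pairs are required")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sums of squares and cross-products about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy), n


# Invented data: two students each lack one of the two measures.
test_scores = [10, None, 14, 18, 22]
grades = [1.0, 2.0, 1.5, None, 3.0]
r, n = pearson_available_cases(test_scores, grades)
```

Because each coefficient may be based on a different subset of students, coefficients in the same table are not strictly comparable unless, as the text assumes, the omissions are unselective.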

Students in Group A took the tests during the second 
semester of the freshman year. The following tests were 
administered individually: 

1. Minnesota Paper Form Board, Form A or Form B 

2. Minnesota Assembly Test, Short Form Set I 

3. Minnesota Spatial Relations Test, Boards A and B 

4. O’Connor Worksample No. 1, Clerical Aptitude 

5. O’Connor Worksample No. 5, Mechanical Aptitude 
(Wiggly Block) 

6. O’Connor Worksample No. 72, Mechanical Aptitude Plus 
Mechanical Knowledge (Dividers) 

The following tests were administered to Group A in group 
form: 

1. Cox Mechanical Explanation Test (E3) 

2. Cox Mechanical Completion Test (C) 

3. Cox Mechanical Models Test (M7) 

4. MacQuarrie Test for Mechanical Ability 

3 Of the students in Group A 59.6% finished four years, 76.9% com- 
pleted at least three years, and 87.5% completed two or more years. Of 
Group B 46.9% completed four years, 54.4% three or more years, and 
69.4% two or more years. 

The following tests, given to Group B, were all given as 
group tests, and were administered during the early part of 
the first semester of the freshman year: 

1. Minnesota Paper Form Board, Form A or Form B 

2. Cox Mechanical Completion Test (C) 

3. Cox Mechanical Models Test (M7) 

4. Minnesota Interest Analysis Blank 

In addition to the above, the following tests, given during 
Freshman Week, were available for inclusion in the data on 
Group B:⁴ 

1. Thorndike Intelligence Examination for High School 
Graduates 

2. Columbia Research Bureau Algebra Test; Form B 

3. Columbia Research Bureau Chemistry Test; Form A 

4. Columbia Research Bureau Plane Geometry Test; Form A 

5. Columbia Research Bureau Physics Test; Form A 


RESULTS 


In Table 1 are presented the zero-order coefficients of corre- 
lation between each of the tests given to Group A and scho- 
lastic rank for the first year, the r’s between each test and 
total or accumulative rank, and the reliability coefficients. 
The latter are based upon split halves of tests, and have been 
‘‘stepped up’’ by means of the Spearman-Brown formula. 
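
The Spearman-Brown formula estimates the reliability of a full-length test from the correlation between its two halves. A brief sketch follows; the function names are hypothetical, and the inverse is included only to show what half-test correlation a given stepped-up coefficient implies.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Step up a part-test correlation to a test `length_factor` times as long."""
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


def half_from_full(r_full):
    """Invert the two-halves case: the half-test r implied by a full-test value."""
    return r_full / (2.0 - r_full)


# A half-test correlation of .60 is stepped up to .75 for the whole test.
stepped_up = spearman_brown(0.60)
```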

In Table 2 coefficients of multiple correlation between 
various combinations of tests and total or accumulative rank 
for Group A are presented. 

Table 3 gives the zero-order coefficients of correlation be- 
tween tests and criteria for Group B. 

Table 4 gives coefficients of multiple correlation between 
various combinations of tests and the criterion of final or 
accumulative rank for Group B. 
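
For a battery of two tests, the coefficient of multiple correlation follows from the two zero-order r's with the criterion and the intercorrelation of the tests. The sketch below gives that two-predictor case only. The intercorrelation of the tests is not reported in this section, so the value used in the example is an assumption chosen purely for illustration.

```python
import math


def multiple_r_two_predictors(r1, r2, r12):
    """Multiple R of a criterion on two predictors.

    r1, r2: correlation of each predictor with the criterion.
    r12: correlation between the two predictors.
    """
    if abs(r12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    r_squared = (r1 ** 2 + r2 ** 2 - 2.0 * r1 * r2 * r12) / (1.0 - r12 ** 2)
    return math.sqrt(r_squared)


# Zero-order r's of the kind reported with total grades, combined under
# an ASSUMED intercorrelation of .50 between the two tests.
example_r = multiple_r_two_predictors(0.426, 0.394, 0.50)
```

With uncorrelated tests the formula reduces to the square root of the sum of the squared r's, which is why tests having low intercorrelations add most to a battery.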


4 The writer is indebted to Dr. J. R. Crawford for permission to use 
this material. 

In a preliminary investigation with a different group of 
first-year engineering students the Stenquist Picture Tests of 
Mechanical Ability, Nos. 1 and 2, and O’Connor’s Work- 
sample No. 75 (Formboard) were given. Scores were corre- 
lated with grades for three semesters, with results as follows: 

                                     r              N 
Stenquist Picture Test No. 1         .268 ± .070    81 
Stenquist Picture Test No. 2         .190 ± .074    78 
O’Connor Worksample No. 75           .255 ± .077    71 

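
The figure following each coefficient is not named in the text. On the assumption that it is the probable error of r, as was customary at the time, it can be recomputed from r and N. The sketch below does so with a hypothetical function name. The first two rows of the preliminary table are reproduced to three decimals, while the third (r of .255 with N of 71) comes out nearer .075 than the printed .077, so the agreement is close but not exact.

```python
import math


def probable_error_of_r(r, n):
    """Probable error of a product-moment r based on n cases."""
    return 0.6745 * (1.0 - r ** 2) / math.sqrt(n)


# Stenquist Picture Test No. 1: r = .268, N = 81.
pe_first = round(probable_error_of_r(0.268, 81), 3)   # 0.07
# Stenquist Picture Test No. 2: r = .190, N = 78.
pe_second = round(probable_error_of_r(0.190, 78), 3)  # 0.074
```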

TABLE 1 
Coefficients of Correlation between Tests and Criteria, and 
Coefficients of Reliability 
Group A 

                                     r with         r with 
Variable                             1st Year       Total          Reliability 
                                     Grades         Grades 

 1. First year grades                ....           .910 ± .011    .917 ± .011 
 2. Minnesota Paper Form Board       .420 ± .055    .426 ± .054    .932 ± .009 
 3. Minnesota Assembly Test          .278 ± .061    .274 ± .061    .651 ± .038 
 4. Minnesota Spatial Relations      .064 ± .066    .056 ± .066    .768 ± .027 
 5. Minnesota Battery E              .392 ± .056    .353 ± .058    .... 
 6. O’Connor No. 1 (Clerical)        .171 ± .064    .170 ± .064    .784 ± .025 
 7. O’Connor No. 5 (Wiggly Block)    .281 ± .061    .273 ± .061    .755 ± .028* 
 8. O’Connor No. 72 (Dividers)       .154 ± .065    .125 ± .065    .... 
 9. Cox Mechanical Explanation       .323 ± .059    .336 ± .059    .561 ± .045 
10. Cox Mechanical Completion        .390 ± .056    .323 ± .059    .708 ± .033 
11. Cox Mechanical Models            .425 ± .054    .394 ± .056    .857 ± .018 
12. MacQuarrie Test (Total Score)    .249 ± .062    .219 ± .063    .962 ± .005† 
13. MacQuarrie Tracing               .135 ± .065    .090 ± .066    .... 
14. MacQuarrie Tapping               .013 ± .066    .010 ± .066    .... 
15. MacQuarrie Dotting              -.047 ± .066   -.038 ± .066    .... 
16. MacQuarrie Copying               .265 ± .062    .276 ± .061    .... 
17. MacQuarrie Location              .240 ± .062    .183 ± .064    .... 
18. MacQuarrie Blocks                .246 ± .062    .223 ± .063    .... 
19. MacQuarrie Pursuit               .107 ± .066    .138 ± .065    .... 

* First and third trials against second and fourth trials. 

† Because of the shortness and prominence of the speed factor in the 
MacQuarrie subtests, the reliability coefficients (based on split halves) are 
judged to be spuriously high and hence are not given above. The value 
for the test as a whole is, of course, spuriously high also. 


TABLE 2 
Coefficients of Multiple Correlation between Various Combinations of 
Tests and Total Grades 
Group A 

Combination of Tests                                          R 

1. All tests (except MacQuarrie Tapping and Dotting)          .544 ± .047 
2. Group tests. Cox Models, Cox Explanation, Cox 
   Completion, MacQuarrie Copying, Blocks, Location, 
   Pursuit and Tapping                                        .458 ± .052 
3. Individual tests. Minn. Paper Form Board, Minn. 
   Assembly, Minn. Spatial Relations, O’Connor No. 
   1, No. 5 and No. 72                                        .482 ± .051 
4. Cox Models, O’Connor No. 1 and No. 72, MacQuarrie 
   Copying and Pursuit                                        .445 ± .053 
5. MacQuarrie Copying, Blocks, Location, Pursuit and 
   Tracing                                                    .301 ± .060 
6. O’Connor No. 1, No. 5 and No. 72                           .325 ± .059 
7. Minn. Paper Form Board and Cox Models                      .464 ± .052 
8. Minn. Paper Form Board and Minn. Assembly                  .434 ± .054 
9. Cox Models, Cox Explanation and Cox Completion             .421 ± .054 


TABLE 3 
Coefficients of Correlation between Tests and Criteria 
Group B 

                                           r with         r with 
Variable                                   1st Sem.       Total 
                                           Grades         Grades 

 1. First semester grades                  ....           [illegible] 
 2. Thorndike Intelligence Examination     .470 ± .042    .434 ± .044 
 3. Minnesota Paper Form Board             .175 ± .057    .214 ± .056 
 4. Cox Mechanical Completion              .364 ± .051    .351 ± .051 
 5. Cox Mechanical Models                  .403 ± .053    .406 ± .053 
 6. Minnesota Interest Analysis Blank      .208 ± .064    .192 ± .064 
 7. C. R. B. Chemistry Test                .430 ± .047    .329 ± .052 
 8. C. R. B. Physics Test                  .589 ± .038    .500 ± .043 
 9. C. R. B. Algebra Test                  [illegible]    .513 ± .040 
10. C. R. B. Plane Geometry Test           .323 ± .048    .380 ± .046 


TABLE 4 
Coefficients of Multiple Correlation between Various Combinations of 
Tests and Total Grades 
Group B 

Combination of Tests                                          R 

 1. All tests                                                 .610 ± .038 
 2. Cox Models, Cox Completion, Minn. Paper Form 
    Board and Minn. Interest Analysis                         .458 ± .047 
 3. C. R. B. Physics, Chemistry, Algebra and Geometry         .585 ± .039 
 4. Thorndike, C. R. B. Algebra, C. R. B. Geometry, 
    Cox Models, Cox Completion, Minn. Paper Form 
    Board and Minn. Interest Analysis                         .593 ± .039 
 5. Thorndike, C. R. B. Algebra, Minn. Paper Form 
    Board and Minn. Interest Analysis                         .571 ± .040 
 6. Thorndike, C. R. B. Algebra, Cox Models and Minn. 
    Interest Analysis                                         .588 ± .039 
 7. Thorndike, Cox Models and Minn. Interest Analysis         .514 ± .044 
 8. Thorndike, Cox Completion and Minn. Interest 
    Analysis                                                  .491 ± .045 
 9. Thorndike, Minn. Paper Form Board and Minn. 
    Interest Analysis                                         .469 ± .047 
10. Thorndike, Cox Models, Cox Completion and Minn. 
    Paper Form Board                                          .496 ± .045 
11. All tests and first semester grades                       .850 ± .017 


DISCUSSION 


Of the mechanical ability tests used in this study the Cox 
tests and the Minnesota Paper Form Board appear to be the 
most valuable as partial measures of engineering aptitude. 
Results with both groups in this study are more favorable to 
the Cox tests than the results obtained by Laycock and 
Hutcheon (16). The Cox Models Test is, as Oakley (20) has 
pointed out, rather difficult to work with and in need of some 
revision to meet practical requirements. The scoring of this 
test was found to be very time-consuming. The Minnesota 
Paper Form Board yielded better results when used as an 
individual test than when administered in group form. Under 
the former conditions it compares favorably with the revised 
form of the test, to judge from the correlations obtained by 
Quasha and Likert (24). It also has the advantage of being 
a more reliable test than most of the others used in this study. 

The MacQuarrie Test proved less useful in this investiga- 
tion than Holcomb and Laslett (10) found it to be, but a num- 
ber of the sub-tests are of some value. The Minnesota Assem- 
bly and O’Connor’s Wiggly Block appear to be of very slight 
value as single tests, and are not of very satisfactory relia- 
bility. 

In general it appears that mechanical ability, as tested by 
some of the current tests, bears a significant relationship to 
success in engineering courses, although in the case of most of 
the tests used in this research the relationship is not close 
enough to make single tests particularly useful in prediction. 
Results with batteries of selected mechanical ability tests are, 
however, considerably better. If one follows the rather broad 
conception of aptitude sponsored by Bingham (3), some evi- 
dence of engineering aptitude is to be found in tests (or test 
batteries) of this type as well as in tests of intelligence, scho- 
lastic aptitude and achievement, in grades in certain high 
school courses, and, in fact, in any other data which can be 
shown to be symptomatic or indicative of potentialities for 
the study of engineering. 

The use of an interest blank in combination with other 
measures appears to be a promising technique, especially in 
view of the fact that interest scores, in this investigation at 
least, have relatively low intercorrelations with other tests in 
the battery. A revised scoring of the Minnesota Interest 
Analysis Blank gave an r of .43 with grades for the criterion 
group (Group B), but this would be expected to shrink some- 
what when applied to a different group (12). It appears that 
r’s in the .30’s can be expected between scores on the Engineer 
Scale of the Strong Vocational Interest Blank (10) and engi- 
neering grades, although Strong (28) does not expect that 
‘‘interest scores will correlate particularly with school 
grades.’’ 

The results obtained in this study are in rather striking 
agreement with Segal (26) who concludes, from a summary of 
many studies, that for prediction of general college scholar- 
ship general achievement tests are best (median r is .545), 
general mental tests next (median r, .44), while tests of spe- 
cific aptitudes or achievements come third (median r, .37). 

The close relation found in this investigation between first 
year grades and the whole scholastic record bears out the find- 
ings of several workers (2, 9, 32) on the importance of achieve- 
ment in the first year mathematics and science courses as indi- 
cators of success in advanced engineering studies. 


SUMMARY AND CONCLUSIONS 


1. Mechanical ability, as measured by several current tests, 
may be regarded as a component of engineering aptitude. 

2. The actual predictive power of most single tests of 
mechanical ability is not great. Of those used, the more 
promising are the Cox Tests of Mechanical Aptitude and the 
Minnesota Paper Form Board. Rather low correlations with 
the criterion of success in engineering courses were found for 
the Minnesota Assembly and Spatial Relations Tests, the Mac- 
Quarrie Test, the O’Connor Worksamples and the Stenquist 
Picture Tests. 

3. Batteries of mechanical ability tests yield R’s with the 
criterion of from .301 to .544; batteries in which an intelli- 
gence test is combined with one or two tests of mechanical 
ability yield R’s of from .495 to .510. Several batteries of 
two, three or more of the mechanical ability tests predict engi- 
neering scholarship at least as well as the intelligence test. 

4. The achievement tests, singly and in combination, predict 
success in engineering studies somewhat better than do the 
tests of mechanical ability. 

5. First semester and first year grades are more closely cor- 
related with the total engineering college record than is any 
test or combination of tests. 



THE USE OF SOME TESTS IN THE PREDICTION 
OF LEGAL APTITUDE 


BRITTEN L. RIKER AND FREDERICK J. GAUDET 
University of Newark 


ALTHOUGH some form of selection exists in all profes- 
sions, the law has lagged behind most of the others in 
standardizing its selective techniques. Among the 
chief reasons for this situation are the philosophy of individ- 
ualism in our political institutions, the non-internationality of 
the profession, and the fact that each state has its own system 
of law. 

Nevertheless, except for a few states which lowered their 
requirements during the Jeffersonian period, there has always 
been some form of selection for the legal profession in the 
United States. The attempt to standardize on a national basis 
the requirements for admission to the profession dates from 
the American Bar Association’s acceptance in 1921 of the Root 
Report, which recommended that no student be admitted to 
law school without having completed at least two years of col- 
lege work. Accordingly, all accredited schools of law adopted 
this prerequisite.¹ 

Many lawyers, however, and many law schools were not sat- 
isfied with these minimum requirements. The profession felt 
that its standards were being lowered by unethical practices 
caused by over-competition and began to realize that although 
bar examiners might pass only 30 to 50 per cent of the candi- 
dates in any one examination, the percentage of those finally 
admitted was very high.² 

1 Many institutions had maintained similar or higher admission require- 
ments before the adoption of the Root Report. 


2 Several studies have shown that approximately 90 per cent eventually 
pass the bar examination. 

The law schools were concerned not only about the standards 
of the legal profession as a whole, but also about their high 
student mortality. Failures in the first year of law school 
frequently mounted to 30 or even 50 per cent. The schools 
felt that this large percentage of failures caused suffering to 
the individuals who were ‘‘flunked out,’’ that the standards 
of the class-room work were lowered by the large percentage 
of poor material in these classes, and that a number of ill- 
qualified students were probably graduated, to the public det- 
riment. Hence both the profession and the schools began to 
feel that new techniques of selection should be introduced. As 
a result, some schools increased their minimum requirements 
from two to either three or four years of college. Others be- 
gan to study the relation between success in law school and 
quality (average grades) or type (major field) of the student’s 
college work. Still others began to experiment with psycho- 
logical tests. 

The first recorded efforts to utilize psychological tests in 
predicting legal success were conducted by the School of Law 
of Columbia University in 1921. This employed a test devised 
by Thorndike which measured ‘‘the capacity on the part of 
the student to work effectively with abstractions and symbols, 
the kind of work required of law school students’’ (11). Simi- 
lar tests are still used at the Columbia School of Law, and are 
given considerable weight in admitting or excluding students 
whose average college grades are below B. 

In 1925 Ferson and Stoddard published another test of 
legal aptitude, the Ferson-Stoddard Law Aptitude Examina- 
tion.³ A summary of previous studies of the use of this test 
follows. 

3 Since the publication of the Ferson-Stoddard Examination at least ten 
other tests have been used in attempts to predict legal aptitude. Although 
some of these tests have proved themselves to have higher coefficients of 
validity than the Ferson-Stoddard, these coefficients are obtained in schools 
using tests which were standardized in the same institution. It is prob- 
able that the variation in students from year to year in the same institution 
is less than the variation between one law school and another. Moreover, 
the Ferson-Stoddard is the only specific test of legal aptitude which is 
available for general circulation. It is published by the West Publishing 
Company, St. Paul, Minnesota, and is furnished without charge to law 
school deans who request it. 

The first study of the validity of this test was conducted by 
Stoddard (7) who used 100 students in the Law School of the 
University of Iowa. The coefficient of correlation between 
scores on the test and first-semester law school grades was 
+.547 and between test scores and first-year scholarship was 
+.539. Stoddard points out that Part 2 has the most value as 
a predictor of first-year law school achievement and Part 3 the 
least value. 

In 1929 Wigmore (8) published the results of an investiga- 
tion which he conducted at Northwestern University on the 
relation between success on the Ferson-Stoddard and law 
school grades. His subjects were 50 volunteer law school stu- 
dents, and his criteria of law school success were the per- 
centages of A, and of A plus B grades obtained by these 
students. He complained that those students who made these 
high grades did not consistently fall in the first quartile of the 
test scores. The next year Wigmore (9) published a further 
analysis of the predictive ability of this test. In this study 
he pointed out that the scores in the lowest quartile are better 
predictors than those in the highest quartile. Wigmore was 
very much dissatisfied with the test. However, Crawford (1) 
and Eagleton (2) in re-examining Wigmore’s data are much 
more optimistic regarding the value of the test as an instru- 
ment of selection, although they are perfectly willing to grant 
Wigmore’s conclusion that there is no one-to-one relationship 
between law school grades and scores on the test. 

Following Wigmore’s criticism of the value of the test, 
Gaudet and Marryott (4) compared scores on the test with the 
law school grades of 246 students in New Jersey Law School. 
A coefficient of correlation of +.42 was obtained between first- 
year grades and test scores. When the scores were ranked in 
fifths, the first fifth being the highest, 82.4 per cent of those 
in the first fifth succeeded in law school, while 61.2 per cent 
of those in the lowest fifth either failed or discontinued. 

In the same year Witham (10) published the results of an 
investigation conducted at the University of Tennessee Law 
School. From the data presented a coefficient of correlation 
of + .54 is obtained when test scores are correlated with first- 
year grades. 

Eagleton (2) in 1932 reported a coefficient of correlation 
of + .49 when test scores were correlated with the first-year law 
school grades of 117 students of the University of Chicago Law 
School. He also found a coefficient of correlation of + .58 
between Part IV of the test and grades. 

The most recent study of the validity of the Ferson-Stoddard 
was conducted by Harrell (6) who obtained a coefficient of 
correlation of +.46 between scores on the test and first-term 
law school grades. 

The only study of the reliability of the Ferson-Stoddard 
which has been reported was that conducted by Gaudet and 
Riker (5). This was based upon 200 students of the University
of Newark Law School and yielded a coefficient of reliability
of +.921 (corrected by the use of the Spearman-Brown
formula) using odd and even items.
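
The Spearman-Brown step mentioned here can be sketched in a few lines. The function names are illustrative only, and the uncorrected odd-even correlation is not reported in the text, so the value printed below is merely back-calculated from the corrected coefficient of .921 under the usual double-length form of the formula.

```python
def spearman_brown(r_half: float, k: float = 2.0) -> float:
    """Estimate the reliability of a test k times as long as the one that gave r_half."""
    return k * r_half / (1.0 + (k - 1.0) * r_half)


def implied_half_test_r(r_corrected: float) -> float:
    """Invert the k = 2 case: the odd-even correlation implied by a corrected coefficient."""
    return r_corrected / (2.0 - r_corrected)


# .921 is the corrected coefficient reported in the text; the raw odd-even
# correlation is not reported, so this value is only an inference.
raw = implied_half_test_r(0.921)
print(f"implied odd-even r = {raw:.3f}")
print(f"stepped up again   = {spearman_brown(raw):.3f}")
```

Stepping the implied half-test value up again returns the reported coefficient, which is simply a check that the two functions are inverses of each other.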


DESCRIPTION OF THE FERSON-STODDARD EXAMINATION 


Although the first edition of this test was published in 1925, most of the 
studies of its validity have been made with the 1927 version. The test is 
made up of four parts: 

Part 1. Verbal memory. The student is asked to read a passage describing
a poisoning case and the ensuing litigation over a reward
claim. The student is not told that he will be questioned on its 
contents. After taking the other three parts of the complete 
test, he is then questioned on the substance of this passage. 
Time—4 min. 

Part 2. Reading comprehension. After reading the report of a trial, 
the candidate is asked to answer true-false, right-wrong, and 
numerical completion questions regarding it. Time—20 min. 

Part 3. Syllogistic reasoning. The candidate is given the premises and 
a conclusion. He is to indicate whether the conclusion is true or
false. Time—20 min. 

Part 4. Reading comprehension. This is a matching test wherein the
subject reads a passage containing certain marked phrases. He 
then matches these with a series of statements. Time—20 min. 


From the above survey of previous studies of the test, it 
appears that the coefficients of validity between tests of legal 
aptitude and law school success are comparable to those be- 
tween scholastic aptitude tests and actual grades. Some col- 
lege administrators consider the coefficients obtained too low 
to warrant using the test as a selective device, but others sug- 
gest that the prediction of law school success is possible by a 
more judicial consideration of the test results. 

It is obvious of course that success in first-year law school 
is a better indicator of success in future legal training than 
scores on this test of legal aptitude. However, the two mea- 
sures are not at all comparable. In the case of the legal apti- 
tude tests, the prospective student is told of his chances for 
success or failure before beginning the training. By the time 
grades in first year of law school can be used predictively, the 
student whose marks are low is in no enviable position. Al- 
though the degree of predictability of the test is less than that 
of first-year grades, the legal aptitude test has many distinct 
advantages over the other measure. From the point of view 
of the student it means that he may be deterred from spending 
a year or more of his life studying for a profession he cannot 
enter, and saved from the humiliation which results when he 
is prevented from continuing his studies. From the point of 
view of the law school it means a decreased student mortality 
and an increase in the average quality of class membership. 
Finally, the public is probably benefitted by a decrease in the 
number of inadequately trained individuals admitted to the 
bar. 

THE PRESENT STUDY 


This study is devoted to an analysis of the relation between
law school grades and scores on four tests4 which were administered
to 180 students entering the New Jersey Law School.5
Only one of these tests has been reported as being used previously
in predicting law school achievement: the Ferson-Stoddard
Law Aptitude Examination.

4 The Ferson-Stoddard Law Aptitude Examination; The Dearborn
Group Test, Examination C; The Otis Self-Administering Test of Mental
Ability, Form D; and the Inglis Test of English Vocabulary.


TABLE 1
Validity of the Tests
(Product Moment Coefficients. N=180)

                                        Average Grades
Tests                        Freshman   Junior   Senior   All years

Ferson-Stoddard
  Part 1                       .294      .244     .189      .290
  Part 2                       .337      .224     .203      .304
  Part 3                       .307      .198     .100      .260
  Part 4                       .127      .187     .151      .182
  Total                        .338      .259     .194      .321
Dearborn Games & Puzzles
  Part 1                       .106      .008     .051      .074
  Part 2                       .237      .125     .145      .177
  Part 3                       .128      .098     .044      .112
  Part 4                       .157      .140     .051      .145
  Total                        .203      .133     .103      .181
Inglis Vocabulary              .220      .206     .176      .245
Otis Self-Administering        .263      .175     .160      .255





The validity of each of these tests is presented in Table 1. 
The grades are not the average of raw grades, but the averages 
of 4Q equivalents, which were obtained for each instructor. 
It is apparent that the Ferson-Stoddard Law Examination is 
a better predictor of law school success than is any of the other 
three tests. It is superior in predicting not only the average 
grades of all three years, but also success in each of the three 
years of law school work. 

5 Now the Law School of the University of Newark.

The correlations between scores on the Ferson-Stoddard and
grades in law school will be observed to be lower than those
reported by other studies in other institutions. This may be
due to several factors: (1) the type of students,6 (2) the type of
teaching, and (3) the validity and reliability of the system of
marking. This third factor is apparently a significant one
with these data, for the correlations between grades received
in one year in law school and those received in other years are
relatively low. (See Table 2.) If the grades in one year correlate
in the fifties and sixties with grades in other years, it is
apparent that our criterion lacks either in validity or reliability.
Hence it is probable that if the marking system were
better, the correlation between test scores and average grades
would be higher.
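
The argument that an unreliable criterion depresses validity coefficients can be illustrated with the classical correction for attenuation. This is only a sketch: the study itself applies no such correction, and treating the freshman-junior correlation of .605 (Table 2) as a stand-in for the reliability of first-year grades is an assumption made here purely for illustration. The function name is hypothetical.

```python
import math


def correct_for_criterion_unreliability(r_xy: float, r_yy: float) -> float:
    """Validity that would be expected if the criterion were perfectly reliable."""
    if not 0.0 < r_yy <= 1.0:
        raise ValueError("criterion reliability must lie in (0, 1]")
    return r_xy / math.sqrt(r_yy)


# .338: Ferson-Stoddard total vs. freshman grades (Table 1).
# .605: freshman vs. junior grades (Table 2), used here as a rough,
#       assumed estimate of how reliable one year's grades are.
estimate = correct_for_criterion_unreliability(0.338, 0.605)
print(f"validity corrected for criterion unreliability = {estimate:.2f}")
```

On these assumed inputs the corrected figure is roughly .43, which falls nearer the coefficients reported from other institutions. The result depends entirely on the reliability estimate chosen.
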

TABLE 2
Correlations between Average Grades in Different Years in Law School

              Freshman   Junior   Senior   All years*
Freshman                  .605     .546      .818
Junior                             .641      .841
Senior                                       .865

* These correlations are spuriously high, of course, since they include
the grades of the year with which they are being correlated.


The Dearborn test was used purely experimentally. It is 
apparent that, considering the time required for its adminis- 
tration and scoring, its value is dubious. The Inglis and Otis 
(both used for experimental purposes) are of approximately 
equal value, and require about the same time for administer- 
ing and scoring. 

When we examine the parts of the Ferson-Stoddard, it 
appears that Part 2 has the highest validity and Part 4 has 
the lowest. Part 2 of the Dearborn has promise, since it cor- 
related better than Part 4 of the Ferson-Stoddard with the 
average first-year grades and almost as well with total grades. 
It will also be observed that this part of the test gives a higher
coefficient of correlation with first-year grades than does the
total Dearborn score or the Inglis score, and that it correlates
almost as highly with average grades for all years as does the
total Dearborn score. It is interesting to note that Parts 1
and 3 of the Dearborn were much too easy, so that many students
received maximum scores. A shortening of the time
limit for the Dearborn test might result in an increased coefficient
of correlation. However, an examination of the scatter
diagrams of scores on Parts 1 and 3 with grades in the various
years indicates that these two tests do not predict well in the
lower range of scores. It is also of interest to observe that
these two tests are the only purely non-language tests used in
this battery.

6 All of these students have had at least two years of college work.
Many had more than this and held college degrees.

Any judgment of the relative value of these tests is also 
somewhat dependent upon the inter-correlations between them. 
Table 3 indicates that apparently no two of them are measuring
the same quality, since the correlations range from +.367
(Inglis vs. Ferson-Stoddard) to +.645 (Dearborn vs. Otis).
Part 2 of the Dearborn (the most valid) also does not correlate 
highly enough with any sub-test to indicate that it measures 
the same trait as any of them. 


SUMMARY 


A study of four tests as predictors of law school success 
shows that each has some value, although the coefficients are 
lower than those usually obtained between test scores and 
grades. It is apparent that Part 4 of the Ferson-Stoddard has
a low validity and that Part 2 of the Dearborn is of sufficiently 
high validity to make further studies worth while. The 
Inglis and Otis are better than Part 4 of the Ferson-Stoddard 
and better than the Dearborn, but it is doubtful whether they 
are sufficiently more valid to justify the time required for their 
administration and scoring. 


THE RELIABILITY AND VALIDITY OF
TWO GRAPHOLOGISTS 


BLAKE CRIDER 
Fenn College 


RECENTLY I have been able to work with two professional
graphologists who gave complete cooperation in
this investigation. The first graphologist (G1) had
practiced graphology professionally for some twenty years. 
The second graphologist (G2) had practiced graphology semi- 
professionally for some ten years and at thirty years of age had 
gone back to college, majored in psychology, and eventually 
took the M.A. in vocational guidance in a leading graduate 
school. 

I first took samples of handwriting according to instructions
given by the graphologists; the same samples were used by both of
them in their analyses. Then I had a stenographer take down
their analyses verbatim. These written analyses seemed astonishingly
correct to me and to the subjects. But I soon discovered
that I could not handle the results quantitatively, for two reasons.
In the first place, they used terms which I did not understand;
in the second place, they spoke in such generalities that
it was hardly possible for them to be wrong.

Next I gave to 18 young adults thirteen standardized psy- 
chological tests which gave 16 different scores. I then wrote 
out in detail the psychologists’ conception of the traits these 
tests purported to measure. I went over these until I thought 
G1 understood my definition. G2 of course had no difficulty 
since she was at this time trained in psychology. Each graph- 
ologist ranked each of the 18 subjects in each of the 16 traits 
measured by the tests. G1 repeated his ranks one month after 
his first ranking. I ranked each subject in the 16 traits according
to his scores on the psychological tests. Correlations
were run by the rank order method. The results and the 
number of ranks on which the correlations are made are in- 
cluded in the accompanying table. 
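
The rank order method is not spelled out in the text; the sketch below assumes the usual rank-difference formula for untied ranks. The two rankings shown are hypothetical and are not the study's data.

```python
def rank_difference_rho(ranks_a, ranks_b):
    """Rank-difference correlation for two untied rankings of the same subjects."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("need two rankings of the same length (at least 2 subjects)")
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1.0 - 6.0 * sum_d_squared / (n * (n * n - 1))


# Hypothetical example with 18 subjects, as in the study: ranks derived from a
# test score versus ranks assigned by a graphologist for the same trait.
test_ranks = list(range(1, 19))
graphologist_ranks = [3, 1, 2, 6, 4, 5, 9, 7, 8, 12, 10, 11, 15, 13, 14, 18, 16, 17]
print(f"rho = {rank_difference_rho(test_ranks, graphologist_ranks):.3f}")
```

The same function serves for all three comparisons described: graphologist against test, one graphologist against the other, and G1's first ranking against his second.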

Three observations may be made from these results: (1) 
The correlations indicate the graphologists do not agree with 
what the psychological tests purport to measure. (2) The two 
graphologists do not agree with each other. (3) G1 agrees 
highly with himself, indicating that whatever he ranks he 
ranks consistently. 

















‘‘APTITUDE’’ AND TRAINING: A SUGGESTED
RESTANDARDIZATION OF THE K-D* 
MUSIC TEST NORMS 


G. M. GILBERT 
Bard College, Columbia University 


THE questionable validity of most aptitude tests is a problem
which has long been a stumbling block in the practical
use of such tests. The solution, of course, will be
arrived at only gradually in the process of construction, ad- 
ministration, reconstruction, standardization, validation, and 
revalidation of each individual test, and the problem of obtain- 
ing valid independent criteria may always remain. However, 
the control of one important factor—that of training—seems 
to have been almost entirely ignored by the test-makers, and 
this, in the writer’s estimation, has undoubtedly contributed 
to the low order of validity coefficients for aptitude tests in 
general. 

There is hardly a function which can be used as a criterion 
of any aptitude—which means hardly any function of the 
human organism, whether it involves sensory acuity, motor 
coordination, manual dexterity, use of symbols, aesthetic judg- 
ment, or anything else—which is not subject in some degree 
to the influence of training. ‘‘Aptitude’’ presumably means 
the relative ability of individuals to benefit from such train- 
ing, or practice effect, and presupposes no training, or ‘‘nor- 
mal’’ training, as the case may be, prior to the test. If apti- 
tude is at all distinguishable from achievement, the training 
factor must be controlled in setting up norms for the interpre- 
tation of scores, as well as in the construction of the test itself. 
Achievement, as measured by the score on any test, or by any
other criterion, is necessarily the product of both aptitude and
training, or, ultimately, heredity and environment. That
being the case, subjects with poor aptitude who have had considerable
training will frequently obtain scores as high as some
with considerable talent but with little or no training. Likewise,
individuals with originally equal talent or apparent aptitude,
if subjected to varying amounts of training, will later
appear to have considerably divergent degrees of talent, according
to scores on aptitude tests. Failure to control training
as a factor in aptitude scores will then obviously affect
correlations for validity, making the coefficients lower than
would otherwise be obtained.

1 Kwalwasser-Dykema Music Test.

Another result of the failure to control training in aptitude 
testing has been the accumulation of data on group differences, 
such as those of race and sex, which are in many instances mere 
artifacts of environmental or training differences. Such data 
were obtained by the writer in a study of musical aptitude, 
using the K-D Music Tests.2 In that study sex differences 
and economic group differences in scores on the test battery 
as a whole were shown to be dependent upon differences in 
training. In view of the results, the writer held that the old 
concept of ‘‘fixed innate potentiality’’ in aptitude testing was 
an unrealistic one, and should be replaced by the concept of 
individual differences modified by environment. 

Two ways were suggested in which norms for aptitude tests 
could take account of training, and so give the untrained full 
credit for their talent, while not exaggerating the talent of 
those who had received the benefit of much training; (a) using 
separate norms for trained and untrained subjects, though not 
necessarily considering the two sets of norms as equivalent 
indices of talent; (b) eliminating from the test battery those 
parts which were unduly susceptible to the influence of training,
provided, of course, that a satisfactory criterion of reliability
and validity could be maintained. The latter scores
would serve as a fairer basis of comparison for all subjects
than those obtained with a heavy loading of a ‘‘practice’’
factor.

2 Gilbert, G. M. Sex differences in musical aptitude and training.
J. Gen. Psychol. (in press).

TABLE 1
College Norms for the K-D Music Tests
(Based on a representative sampling of 1,000 college students
in 12 colleges in 5 states)

Columns: K-D score, with percentiles for untrained and for trained
subjects; ‘‘C’’ score, with percentiles for all subjects.
[Table entries are not legible in this copy.]

K-D Score, complete battery of 10 sets, has possible total of 275 points;
min. (guess) = 137.

‘‘C’’ Score, composite of 6 tests, possible maximum is 175 points; min.
(guess) = 87.

Table indicates complete range of scores obtained.

The ‘‘trained’’ are those who have had at least 4 yr. of private music
lessons; (av. 34 yrs.).

As a practical first step in testing the applicability of these 
suggestions through independent research, we are submitting 
herewith a set of norms for college students on the K-D Music 
Tests, based on the 1,000 cases used in our experiment. These 
norms embody both of the suggestions made above, and should 
be useful not only in the practical use of the test in question, 
but in checking the importance of training as a factor in apti- 
tude testing, especially in correlations for validity or with 
other independent variables. Similar norms would be valu- 
able on the grammar school and high school levels, but are 
not available to the writer at the present time. 

The K-D battery consists of the following 10 sub-tests:

Tonal Memory                Rhythm Discrimination
Quality Discrimination      Pitch Discrimination
Intensity Discrimination    *Melodic Taste
*Tonal Movement             *Pitch Imagery
Time Discrimination         *Rhythm Imagery





Those marked with an asterisk (*) are the four tests agreed
to be most susceptible to the influence of training, and, as our 
study showed, when these tests are eliminated, sex differences 
(which correspond to training differences) are also eliminated. 
The remaining tests are therefore used for a Composite (‘‘C’’) 
score, to minimize the influence of training. Fortunately, 
these 6 tests correspond exactly to the 6 tests in the revised 
Seashore battery, so that this Composite score may also serve 
as a shorter (though probably less reliable) alternative battery 
to the Seashore, or as a rough equivalent score where it proves 
desirable to compare results obtained with the K-D tests to 
results obtained by other investigators with the revised Sea- 
shore test. 
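
The scoring rule described above, dropping the four asterisked sub-tests and summing the remaining six, can be sketched as follows. The sub-test names come from the list given earlier; the per-test scores in the example are hypothetical, since the point values of the individual sub-tests are not stated here.

```python
ALL_SUBTESTS = [
    "Tonal Memory", "Quality Discrimination", "Intensity Discrimination",
    "Tonal Movement", "Time Discrimination", "Rhythm Discrimination",
    "Pitch Discrimination", "Melodic Taste", "Pitch Imagery", "Rhythm Imagery",
]

# The four sub-tests marked with an asterisk as most susceptible to training.
TRAINING_SENSITIVE = {"Tonal Movement", "Melodic Taste", "Pitch Imagery", "Rhythm Imagery"}

COMPOSITE_SUBTESTS = [name for name in ALL_SUBTESTS if name not in TRAINING_SENSITIVE]


def full_battery_score(scores: dict) -> int:
    """Total K-D score over all 10 sub-tests."""
    return sum(scores[name] for name in ALL_SUBTESTS)


def composite_score(scores: dict) -> int:
    """Composite ("C") score: the six sub-tests less affected by training."""
    return sum(scores[name] for name in COMPOSITE_SUBTESTS)


# Hypothetical sub-test scores for one student (not taken from the norms).
example = {name: 20 for name in ALL_SUBTESTS}
print(full_battery_score(example), composite_score(example))
```

A student's two totals would then be looked up in the K-D and ‘‘C’’ columns of the norms respectively.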

The chief purpose in presenting these norms is, however, to 
make available one tool for testing the hypothesis that for pur- 
poses of practical psychometrics there is no such thing as 
‘‘aptitude’’ independent of training, and that this factor must 
be taken account of either in the construction of the test, or 
in setting up the norms, or both. When this practice becomes 
widespread, we feel certain that test validation will be greatly 
facilitated, and much of the present data involving correla- 
tions and group differences will become obsolete. 


THE ABILITY OF UNTRAINED SUBJECTS TO 
JUDGE INTELLIGENCE AND AGE FROM 
HANDWRITING SAMPLES 


WARREN C. MIDDLETON 
DePauw University 


INTRODUCTION 


THE purpose of this study is to discover to what extent
untrained subjects can judge intelligence and age from
handwriting samples. Practically all graphologists are
of the opinion that it is possible for an experienced judge to 
determine from script both the intelligence and the age of an 
individual. Crépieux-Jamin (2) has, as a matter of fact, 
described in some detail the handwriting ‘‘signs’’ which pur- 
port to indicate inferior and superior mentality. While some 
few investigations have been made on the judgment of intelli- 
gence from script, there has been, as far as the author has 
any knowledge, practically no experimental work reported on 
the ability of untrained subjects (or of professional grapholo- 
gists) to estimate age from handwriting. 

A few investigators have compared writing quality, as deter- 
mined by teacher ratings or by standardized scales, with intel- 
ligence test scores. The reported results are somewhat con- 
fusing. For example, Thorndike (9) got practically zero 
agreement with adult subjects, while Gesell (4) and Starch 
(8) got positive correlations (.30 or better) with children. 
Allport and Vernon (1) conclude that the excellence of chi- 
rography seems to depend on intellectual ability only (if at 
all) in immature subjects. 

Probably the most thorough study of the use of handwriting 
in estimating intelligence has been reported by Omwake (7). 
In this investigation, quality of writing, as determined by
Thorndike’s scale, correlated only .047 with Army Alpha 
scores. Von Foerster (3) had the script samples of 70 stu- 
dents rated by 5 judges with reference to their aptitude. 
Correlations between the ratings and tests of intelligence and 
special ability were .29 or less. 


METHOD 


The present study consists of two separate experiments. 
The first one has to do with judgment of intelligence from 
script; the second one is an attempt to discover to what extent
age can be determined from handwriting samples. The two 
investigations were performed three weeks apart; the subjects 
and the judges were different for each experiment. 

In the first experiment (handwriting-intelligence) handwriting
samples were secured from 20 DePauw University
students1 (10 men and 10 women), selected at random from all
the Sophomore students enrolled. The criterion used was the 
decile classification derived from the American Council on 
Education Psychological Examination (1938 Edition). One 
man and one woman were selected from each decile; this pro- 
vided ten different degrees of intelligence measurement, from 
very inferior to very superior. Ninety-eight students (57 men 
and 41 women) judged the handwriting samples of these sub- 
jects. The judges made their ratings from script on a 10-point 
scale. Each person was requested to indicate both his name 
and sex on his mimeographed rating sheet. Each script 
sample was numbered; the rating scale numbers from 1 to 10
were printed after each sample number. The ratings were 
made by drawing a diagonal line through the number which 
indicated the intelligence decile into which the judge believed 
the person whose handwriting he observed should be classified. 
No free comments were elicited. 

1 The author would probably be criticized by many graphologists on the 
ground that his group of subjects is too homogeneous. Indeed, something 
may be said for this criticism, although it is difficult, if not impossible, 
to say precisely what effect such homogeneity actually might have. 

In the second experiment (handwriting-age) handwriting
samples were secured from 14 individuals (7 men and 7 
women) who ranged in age from 15 to 75 years. None of these 
subjects was enrolled in college; the two 15-year-old subjects 
were attending high school. The subjects were equated as 
closely as possible for educational status, though not for 
intelligence. All of the women subjects (with the exception 
of the 15-year-old girl) were housewives; the men (with the 
exception of the 15-year-old boy and the 75-year-old man) were 
engaged in various proprietorial occupations. One man and 
one woman were selected from the 15th, 25th, 35th, 45th, 55th, 
65th, and 75th years; this provided seven different degrees of 
age. One hundred and thirteen students (52 men and 61 
women) judged the handwriting samples of these subjects. 

The judges made their ratings from script on a 7-point scale. 
Each script sample was numbered on the mimeographed rating 
sheet; the rating scale numbers 15-25-35-45-55-65-75 were 
printed after each sample number. The ratings were made (as 
in the first experiment) by drawing a diagonal line through 
the number which indicated the age which the judge believed 
the person was whose handwriting he observed. 

In both experiments each subject was asked to write from
dictation on a 4 x 6-inch card the sentence: ‘‘The dog jumps
quickly over the fence after the lazy brown fox.’’2 All subjects
used the same pen. They were told to write in their natural 
style and at their normal speed. Despite the instructions, how- 
ever, it can hardly be supposed that some of them did not take 
more than ordinary care with their writing. 
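
The dictated sentence is a pangram, so every subject's sample shows every letter of the alphabet at least once. That property is easy to check mechanically; the short sketch below (the helper name is illustrative) lists any letters a candidate sentence fails to include.

```python
import string

SENTENCE = "The dog jumps quickly over the fence after the lazy brown fox."


def missing_letters(text: str) -> list:
    """Return the letters of the alphabet that do not occur in text."""
    return sorted(set(string.ascii_lowercase) - set(text.lower()))


print(missing_letters(SENTENCE))  # prints [] because no letter is missing
```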

In both experiments the script samples were thrown by an 
opaque projector on a screen one at a time before the two 
groups of student judges. These judges were all untrained; 
none had ever made a serious study of handwriting. Each
subject’s handwriting was presented twice; this made it possible
to determine the reliabilities of the judgments. The second
presentation came 48 hours after the first, and in a different
order. The two orders of presentation were determined by
chance selection. It may be doubted that memory played any
important role in identity of judgment.

2 This sentence was used by Kinder (5) in a study of the ability to
judge sex from handwriting. The author (6) has also used it in a study
of the ability to judge dominance from script. It will be noted that the
sentence contains all the letters of the alphabet.


TABLE 1
Means and SD’s of Judges’ Ratings of Intelligence from the
Handwriting Samples of Each of Twenty Subjects

                                       Intelligence    Ratings by judges (N = 98)
Subject(a)   Sex(b)   Presentation(c)   decile(d)          Mn         SD

FR             W         1 & 10             9             6.31       2.07
LI             M         2 &  5             5             4.58       2.09
ID             W         3 & 15             8             6.94       1.56
BE             W         4 &  2             3             6.53       1.74
MA             M         5 & 14            10             5.87       1.84
PA             M         6 &  7             4             4.67       2.10
EM             W         7 & 17             2             5.74       2.20
HO             W         8 & 11             6             5.51       1.62
SM             M         9 &  1             1             5.44       2.27
HA             W        10 & 13             7             6.18       2.22
WA             M        11 & 20             9             5.60       2.40
SE             M        12 &  4             3             5.45       2.13
MC             M        13 & 18             6             5.56       2.06
ME             M        14 &  8             8             5.11       1.74
GR             W        15 & 16             5             6.10       2.08
DR             W        16 &  3            10             6.21       1.94
DI             M        17 & 19             2             6.41       2.15
JE             W        18 & 12             4             6.19       2.30
SO             M        19 &  6             7             6.43       2.34
TH             W        20 &  9             1             6.65       1.91

a Subjects are designated by using the initial letters of their first and
last names.

b Women subjects are referred to by the letter W; men subjects by the
letter M.

c The numbers indicate the two orders of presentation of handwriting
samples. For example, Subject FR’s handwriting was presented first in
the first series and tenth in the second series (48 hours later).

d The respective intelligence deciles are derived from the scores made on
the American Council on Education Psychological Examination (1938
Edition).

RESULTS AND DISCUSSION

A. Handwriting and Intelligence

The means and SD’s of the 98 judges’ ratings of intelligence
from the handwriting samples of the 20 subjects are shown
in Table 1. The means of the judges range from 4.58 (Subject
LI, 5th decile) to 6.94 (Subject ID, 8th decile). The
SD’s range from 1.56 (Subject ID, 8th decile) to 2.40 (Subject
WA, 9th decile).

TABLE 2
Reliability Coefficients of Judges’ Ratings of Intelligence from the
Handwriting Samples of Twenty Subjects

Subject(a)   Presentation(b)    r(c)    PE_r    Corrected r(d)

FR              1 & 10          .24     .07         .39
LI              2 &  5          .28     .07         .43
ID              3 & 15          .53     .06         .69
BE              4 &  2          .31     .07         .47
MA              5 & 14          .36     .06         .53
PA              6 &  7          .26     .07         .41
EM              7 & 17          .21     .07         .35
HO              8 & 11          .69     .04         .82
SM              9 &  1          .03     .07         .06
HA             10 & 13          .24     .07         .39
WA             11 & 20          .32     .07         .48
SE             12 &  4          .11     .07         .19
MC             13 & 18          .26     .07         .41
ME             14 &  8          .61     .05         .76
GR             15 & 16          .27     .07         .42
DR             16 &  3          .20     .07         .34
DI             17 & 19          .22     .07         .36
JE             18 & 12          .08     .07         .15
SO             19 &  6          .14     .07         .25
TH             20 &  9          .16     .07         .27
Total                           .36     .01         .53

a Subjects are designated by using the initial letters of their first and
last names.

b The two orders in which each subject’s handwriting sample was presented
to the judges.

c Correlation of the judges’ ratings for the first presentation of handwriting
with the second presentation.

d Corrected by the Spearman-Brown formula.

The raw and the corrected reliability coefficients of the
judges’ ratings of intelligence from the handwriting samples
of each of the 20 subjects are shown in Table 2. It will be
noted that these corrected coefficients (Spearman-Brown)
range from .06 (Subject SM, 1st decile) to .82 (Subject HO,
6th decile). Three coefficients are less than .20, two are in the
twenties, five in the thirties, six in the forties, one in the fifties,
one in the sixties, one in the seventies, and one in the eighties.
The ratings on only four of the subjects (those exceeding a
correlation of .50) show anything like true consistency. The
corrected reliability coefficient of the judges’ ratings for the
total group of subjects is .53. Although not shown in Table 2,
the corrected r for the men judges is .48; for the women judges,
.64. Thus, the women show a higher consistency of judgment.

The product-moment correlation between the judges’ ratings
of intelligence from the handwriting samples of the 20 subjects
and the intelligence deciles of the subjects is shown in Table 3.


TABLE 3
Correlation between Judges’ Ratings of Intelligence from the Handwriting
Samples of Twenty Subjects and the Intelligence
Deciles of the Subjects

Judges       N       r      PE_r

Men         57     .001     .01
Women       41     .041     .02
Total       98     .018     .01











All of the coefficients are practically zero. The r for the judges
is .018 ± .01 (.001 ± .01 for the men; .041 ± .02 for the
women). From this study it must be concluded that untrained
student judges cannot estimate intelligence from handwriting
samples.
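
As a rough check on this conclusion, the product-moment formula can be applied to the per-subject figures in Table 1. This is not the computation behind Table 3, which presumably rests on the individual ratings; the sketch below correlates only the 20 mean ratings with the 20 deciles, and the function name is illustrative.

```python
import math

# (intelligence decile, mean rating by the 98 judges) for the 20 subjects,
# in the order in which they appear in Table 1.
TABLE_1 = [
    (9, 6.31), (5, 4.58), (8, 6.94), (3, 6.53), (10, 5.87),
    (4, 4.67), (2, 5.74), (6, 5.51), (1, 5.44), (7, 6.18),
    (9, 5.60), (3, 5.45), (6, 5.56), (8, 5.11), (5, 6.10),
    (10, 6.21), (2, 6.41), (4, 6.19), (7, 6.43), (1, 6.65),
]


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


deciles = [d for d, _ in TABLE_1]
mean_ratings = [m for _, m in TABLE_1]
print(f"r (mean rating vs. decile) = {pearson_r(deciles, mean_ratings):.3f}")
```

Even with the judges' disagreements averaged out, the correlation stays small, in line with the near-zero coefficients of Table 3.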
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B. Handwriting and Age 


The means and SD’s of the 113 judges’ ratings of age from 
the handwriting samples of each of 14 subjects are shown in 
Table 4. The means of the judges range from 31.06 (Subject 


TABLE 4


Means and SD’s of Judges’ Ratings of Age from the Handwriting
Samples of Each of Fourteen Subjects


                                            Ratings by judges
                                                (N = 113)
Subjects a   Sex b   Presentation c   Age      Mn       SD
CH            W         1 &  5         55     44.24    24.96
SS            M         2 & 11         25     31.06    19.46
TC            M         3 &  2         65     53.89    22.12
MC            W         4 &  8         15     32.66    18.54
EH            M         5 & 10         55     60.09    22.51
IF            W         6 &  4         35     32.66    17.26
ES            W         7 & 14         25     39.86    24.45
EP            M         8 &  1         75     50.44    27.46
RH            W         9 & 13         45     46.68    23.62
BB            M        10 &  6         15     45.04    21.74
LC            W        11 &  9         65     34.78    24.99
LH            M        12 &  3         45     46.86    25.59
NA            W        13 & 12         75     50.66    27.72
CF            M        14 &  7         35     38.85    22.75


a Subjects are designated by using the initial letters of their first and
last names.

b Women subjects are referred to by the letter W; men subjects, by
the letter M.

c The numbers indicate the two orders of presentation of handwriting
samples. For example, Subject CH’s handwriting was presented first in
the first series and fifth in the second series (48 hours later).


SS, 25 years old) to 60.09 (Subject EH, 55 years old). 
Table 4 shows that there is a definite tendency on the part of 
the judges to avoid using the extremes of the rating scale; the 
middle values are used with much consistency. The SD’s 
range from 17.26 (Subject JF, 35 years old) to 27.72 (Subject
NA, 75 years old).

The raw and the corrected reliability coefficients of the
judges’ ratings of age from handwriting samples of each of 
the 14 subjects are shown in Table 5. These corrected coeffi- 


TABLE 5


Reliability Coefficients of Judges’ Ratings of Age from the Handwriting
Samples of Each of Fourteen Subjects


Subject a   Presentation b    r c    PE_r    Corrected r d
CH             1 &  5         .29    .06        .45
SS             2 & 11         .06    .06        .11
TC             3 &  2         .23    .06        .37
MC             4 &  8         .11    .06        .19
EH             5 & 10         .17    .06        .30
IF             6 &  4         .08    .06        .15
ES             7 & 14         .02    .06        .03
EP             8 &  1         .22    .06        .36
RH             9 & 13         .12    .06        .20
BB            10 &  6         .14    .06        .25
LC            11 &  9         .16    .06        .28
LH            12 &  3         .25    .06        .40
NA            13 & 12         .09    .06        .17
CF            14 &  7         .11    .06        .19
Total                         .28    .02        .43


a Subjects are designated by using the initial letters of their first and
last names.

b The two orders in which each subject’s handwriting sample was pre-
sented to the judges.

c Correlation of the judges’ ratings for the first presentation of hand-
writing with the second presentation.

d Corrected by the Spearman-Brown formula.


cients range from .03 (Subject ES, 25 years old) to .45 (Sub- 
ject CH, 55 years old). Six corrected reliability coefficients 
are below .20, three are in the twenties, three in the thirties, 
and two in the forties. The judgments are not consistent; in 
fact, the ratings on only two of the subjects (those having a 
correlation of .40 or above) show anything like reliability. 
The corrected reliability coefficient of the judges’ ratings for
the total group of subjects is .43. Although not shown in
Table 5, the corrected r for the men judges is .45; for the 
women judges, .42. Thus, there is practically no sex difference 
in consistency of judgment. 
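
The corrected coefficients in Table 5 appear consistent with the doubled-length form of the Spearman-Brown formula, corrected r = 2r / (1 + r). The text names the formula but not the variant, so the sketch below rests on that assumption; the function name `spearman_brown` is illustrative only. It checks the assumed form against the raw coefficients reported for Subjects CH, TC and LH.

```python
def spearman_brown(r, k=2.0):
    """Step up a reliability coefficient r to a measure k times as long."""
    return k * r / (1.0 + (k - 1.0) * r)


# Raw retest coefficients for three subjects, as reported in Table 5.
raw_coefficients = {"CH": 0.29, "TC": 0.23, "LH": 0.25}

for subject, raw in raw_coefficients.items():
    # Rounded to two places for comparison with the "Corrected r" column.
    print(subject, round(spearman_brown(raw), 2))
```

Under this assumption the three values round to .45, .37 and .40, which agree with the corrected column for those subjects.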

The product-moment correlation between the judges’ ratings
of age from the handwriting samples of the 14 subjects and the 
actual ages of the subjects is shown in Table 6. The correlation 


TABLE 6


Correlation between Judges’ Ratings of Age from the Handwriting
Samples of Fourteen Subjects and the Actual
Age of the Subjects


Judges       N       r      PE_r
Men         52     .29      .02
Women       61     .24      .02

Total      113     .25      .01














for the 113 judges is .25 ± .01 (.29 ± .02 for the men;
.24 ± .02 for the women). The men judges are slightly more
accurate than the women. From the present study it must be 
concluded that untrained student judges have some slight 
ability to estimate age from handwriting samples. 
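
The figure that follows each coefficient is a probable error. A formula in common use at the time was PE = 0.6745 (1 - r^2) / sqrt(N); the text does not state which formula or which N was used, so the sketch below is only an illustration under that assumption, and the function name is hypothetical. It uses the retest coefficient of .29 for Subject CH in Table 5, taken over the 113 judges.

```python
import math


def probable_error_r(r, n):
    """Probable error of a correlation r based on n paired observations."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Subject CH in Table 5: r = .29, assumed N = 113 judges.
pe = probable_error_r(0.29, 113)
print(round(pe, 2))
```

With these inputs the result rounds to .06, the probable error listed for the per-subject coefficients in Table 5.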


SUMMARY OF RESULTS 


The following conclusions are of a tentative nature and 
apply only to the present study:

1. The reliability coefficient of 98 untrained student judges’ 
ratings of intelligence from the handwriting samples of 20 
student subjects is, in general, rather low. The judgments do 
not indicate marked consistency. 

2. The coefficient of correlation between the judges’ ratings 
of intelligence from the handwriting samples of the 20 subjects 
and the intelligence deciles of the subjects is practically zero. 

3. The reliability coefficient of 113 untrained student judges’ 
ratings of age from the handwriting samples of 14 subjects is 
not satisfactory.

4. The coefficient of correlation between the judges’ ratings
of age from the handwriting samples of the 14 subjects and the 
actual ages of the subjects is .25. Thus, the judges have some 
slight ability to estimate age from handwriting samples. 

5. In judging both intelligence and age from script, sex dif- 
ferences are negligible with respect to accuracy in making 
estimates.


A STUDY OF THE VARIABILITY OF
IQ’S IN RETESTS


FRANCES E. LOWELL 
Clinical Psychologist, Board of Education, Cleveland


PURPOSE OF THIS STUDY


THIS study was started about five years ago after the
writer had been examining Cleveland Public School
children over a period of ten years. Results seemed to
indicate that while the intelligence quotients of some children
remained constant on retest, many had lost numerous points 
and others had made appreciable gains. This was apparently 
corroborated by the school records, especially in cases where 
the IQ’s decreased. These showed a yearly decrease in both 
quantity and quality of achievement. In order to ascertain to 
what extent these changes actually occurred and when they 
occurred, this study was made. 


ORGANIZATION OF CLEVELAND SCHOOLS 


Before going into details of the research, it might be well to 
state briefly the organization of the Cleveland Public Schools, 
for this explains why there are so many requests for tests and 
retests of school children. First, there are classes for children 
from seven to twelve years of age whose IQ’s are below 50, 
called Classes for Low Mentals. These children would be 
excluded from all other types of schools. Higher than these 
are the Special Classes for Mental Defectives, having IQ’s 
from 50-69 inclusive; then come ‘‘Z’’ or borderline children
with IQ’s from 70-85; above these ‘‘Low Y’’ groups with IQ’s
from 86-94; then normal or ‘‘High Y’’ children with IQ’s from
95-109; ‘‘X’’ children with IQ’s from 110-124; and highest
of all the Major Work Classes that have IQ’s over 125.


GROUPING BEGINS IN KINDERGARTEN


All children enter Kindergarten as ‘‘High Y’’s or normal 
individuals. A few, whose birthdays come within a month 
after the regular deadline for Kindergarten entrance, are 
tested for underage entrance and if they have IQ’s of at least 
120 and show good physical, emotional and social adjustment, 
they are permitted to enter Kindergarten underage. As a 
school term progresses, some Kindergarten children prove to 
be misfits. They cannot follow directions given to the group 
or they are emotionally unstable and demand a great deal of 
individual attention. These are offered for study and test, 
and the few found to be too immature for Kindergarten work 
are temporarily excluded. Others, not sufficiently immature 
to warrant exclusion, progress very slowly, have difficulty in 
developing good work habits, are poor in motor coordination 
and slow in comprehension. These gradually sort themselves 
into a little group which needs at least three terms of Kinder- 
garten. The remaining children in Kindergarten show vary- 
ing degrees of ability. By the end of a year the first sorting 
has been made and then those needing three terms in Kinder- 
garten, as well as the group showing superior ability, are 
tested to corroborate the teacher’s judgment. 


VARIATIONS IN CURRICULA FOR GROUPS 


The curricula for the groups vary in difficulty and in amount 
of subject matter covered. Thus in reading, ‘‘X’’ children 
have 22 levels to complete before they finish the sixth semester 
or 3A; ‘‘High Y’’ children have to complete 20 levels in the
same period; ‘‘Low Y’’ and ‘‘Z’’ groups take two, three or 
more additional semesters to complete 3A, with ‘‘Low Y’’s 
having 17 and ‘‘Z’’s only 12 levels. In grades 4 through 6, 
‘‘Low Y’’s and ‘‘Z’’s may take extra time, thus making the 
former enter Junior High at thirteen or thirteen and a half 
years of age, and the latter at fourteen years instead of the 
twelve years or less of the ‘‘X’’s and ‘‘High Y’’s.


TYPES OF CASES TESTED


Since teachers are often hesitant about placing children in 
lower groups on their judgment only, a Binet as an objective 
measure is requested. When children increase in age but fail 
to keep the expected pace in achievement, tests or retests are 
requested either by teachers or parents. Dropping IQ’s are 
detected in school work before they show up in the Binet. 
Again, children offered for placement in any special type of 
class must meet certain IQ standards, so tests are made for 
placements in Major Work Classes, Classes for Sight Saving, 
Deaf School, School for Crippled Children, Classes for Men- 
tally Defective and Low Mentals, and classes having hearing 
aids, etc. Behavior problems of all varieties are studied and
tested by the Clinic. Children returning from Correctional 
Institutions are retested for school placement. Occasionally, 
in schools where some educational experiment is in progress, 
whole rooms of children are given the Binet test. This has 
been done in a number of first grades. 


PERSONNEL DOING TESTS 


The nine members of the Clinic making these tests are all 
highly qualified psychologists, four of whom have Ph.D. and 
the rest M.A. degrees. All are members of A.A.A.P. Each 
psychologist keeps, in so far as numbers, etc., permit, the same 
schools year after year, retesting and following up cases 
studied over long periods. However, since many children 
move from one school district to another, there are large num- 
bers of cases tested by several different psychologists. All 
tests are filed at the Clinic, so the results used in this research 
include tests made by all members of the group. 


SELECTION OF CASES FOR THIS STUDY 


In getting the data for this study the writer took from the 
files of the Psychological Clinic the first 1000 cases that had 
two tests only and these were designated the ‘‘2-Test Series’’;
a second 1000 cases that had three tests only, or the ‘‘3-Test
Series’’; and a third 1000 cases that had four tests only, or
the ‘‘4-Test Series.’’ A few cases were discarded when some 
known cause such as foreignness, deafness, etc., would make 
the test unreliable, as well as a few cases where tests were made 
less than a year apart, which again might invalidate results. 
Terman’s 1916 revision of the Binet has been used in all tests. 


DATA 


A. Five general comparisons were made between the various 
tests of each series. 

1. A comparison of the frequency distribution of IQ’s. The 
cases were distributed over a wide range, from IQ group 30-34 
through IQ group 155-159. Since complete tables are too 
voluminous for this report, only the modes and the IQ groups 
in which they fall will be given for the tests of each series. 


          2-Test Series          3-Test Series          4-Test Series
Test    Mode       IQ Group    Mode       IQ Group    Mode       IQ Group

 1     179 cases    90-94     165 cases    90-94     154 cases    90-94
 2     185 cases    75-79     181 cases    75-79     220 cases    75-79
 3                            183 cases    70-74     228 cases    70-74
 4                                                   214 cases    60-64

Combining the 3000 first tests, the mode falls in the 90-94 group
Combining the 3000 second tests, the mode falls in the 75-79 group
Combining the 2000 third tests, the mode falls in the 70-74 group


Summary: There seems to be a definite lowering of the IQ 
group with each successive retest, and this holds true regard- 
less of the large numbers of cases tested. 

2. A comparison of the frequency distribution of cases at 
each C.A. on first test. The range of C.A. at Test 1 was from 
year 2 through year 13, but 94 per cent of the 3000 cases had 
their first test at year 5, 6, 7 and 8, as shown below. 


30 per cent of the 3000 cases had first test at year 5
48 per cent of the 3000 cases had first test at year 6
11 per cent of the 3000 cases had first test at year 7
 5 per cent of the 3000 cases had first test at year 8


This is to be expected since children usually find their respec-
tive ability groups during the first three years of school atten-
dance. Most of the cases tested for the first time at the higher
chronological ages are children who have just entered the 
public schools from out of town or from neighboring parochial 
schools, and who do not fit into their respective age groups. 

3. A comparison of Median IQ’s for each C.A. Only those 
ages having 40 or more cases in each of the test series are given 
in the summary below. See Table 1. 


TABLE 1
A Comparison of Median IQ’s for Each Chronological Age


         2-Test Series       3-Test Series            4-Test Series
C.A.     No.    Med. IQ      No.    Med. IQ           No.    Med. IQ
(Year)  cases   T1   T2     cases   T1   T2   T3     cases   T1   T2   T3   T4

  5      217    94   86      279    88   81   76      394    81   77   71   66
  6      469    91   83      502    88   78   73      468    80   75   69   64
  7      120    87   79      114    80   73   67       91    74   69   64   60
  8       59    87   78       50    77   67   64.5    Insufficient cases





Summary: Here, also, we find the same steady decrease in 
the median IQ for each chronological age, on successive retests 
in each series, as we found in the IQ groups. 
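
The steady decrease can be made explicit by differencing the medians from one test to the next. The snippet below is a small illustration using only the C.A. 5 row of Table 1; the variable and function names are not from the study.

```python
# Median IQs on successive tests (T1, T2, ...) at C.A. 5, copied from Table 1.
medians_at_age_5 = {
    "2-Test": [94, 86],
    "3-Test": [88, 81, 76],
    "4-Test": [81, 77, 71, 66],
}


def successive_changes(values):
    """Difference between each test's median and the one before it."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


for series, medians in medians_at_age_5.items():
    print(series, successive_changes(medians))
```

Every difference in this row is negative, which is the pattern the summary describes.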


TABLE 2
A Comparison of Median IQ’s for Each Mental Age


         2-Test Series        3-Test Series            4-Test Series
M.A.     No.    Med. IQ       No.    Med. IQ           No.    Med. IQ
(Year)  cases   T1     T2    cases   T1   T2   T3     cases   T1   T2   T3   T4

  3     Insufficient cases     48    65   67   69      130    62   66   65   61
  4      109    79     78     228    77   75   70      380    76   75   68   63
  5      386    89.5   82     421    88   78   73      364    86   77   70   66
  6      303    97     85     223    95   81   76      101    96   80   75   71
  7       88    88     81      45    90   82   75     Insufficient cases
  8       52    86     76.5   Insufficient cases      Insufficient cases


4. A comparison of Median IQ’s for each M.A. As in the
preceding table, only those ages having 40 or more cases per
test are included. See Table 2.

Summary: The median IQ for each mental age except Year 3 
shows a definite decrease on each retest in each series. Cases 
having a mental age of three on first test are too few to con- 
sider in the 2-Test Series and too few to be very reliable in the 
3-Test Series. However, there seems to be a tendency for them 
to increase on first retest and then, as shown in the 4-Test 
Series, to decrease on further retests after the fashion of other 
ages. This is probably due to the inaccuracy of an IQ when 
the mental age is as low as three years. 

5. A comparison of the sex distribution for each series. 











                 Boys    Girls
2-Test Series     579     421
3-Test Series     592     408
4-Test Series     615     385
Total            1786    1214


Summary: Approximately 60 per cent of the 3000 cases are 
boys and 40 per cent girls. This corresponds to the distribu- 
tion found in our Clinic’s annual report. 

B. The specific problem of the variability of the IQ was next 
considered. This was divided into two phases: one, a com- 
parison of IQ changes between the first and last tests of each 
series; and two, a comparison of IQ variations from one test 
to another in the 3 and 4-Test Series. We shall first present 
the data for phase one. 

1. Variations in IQ’s between the first and last tests of each 
series. A survey of the data showed that variations in IQ 
ranged from 0 to a 53 point decrease between first and last 
tests. It seemed advisable to separate the cases showing no 
variation at all from the others, and these were designated 
Group I. 

The probable error of the Binet IQ is usually given as plus 
or minus 5.8. Therefore those cases varying in IQ one to six
points inclusive were kept together as Group II, since such
changes might be merely chance variations and not real changes 
in the IQ. 

The remaining cases varying 7 or more points in IQ were 
placed in Group III. 

The 3000 cases were thus divided according to variations 
between first and last tests as follows: 


Group I       93 cases or  3%, having 0 variation
Group II     771 cases or 26%, having 1-6 pts. (incl.) variation
Group III   2136 cases or 71%, having 7 or more pts. variation
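
The grouping rule can be written out as a short sketch. It assumes that "variation" means the absolute difference between the first and last IQ, which is how the text reads; the function name is hypothetical. The second half recomputes the rounded percentages from the case counts given above.

```python
def variation_group(first_iq, last_iq):
    """Return "I", "II" or "III" from the absolute first-to-last IQ change."""
    change = abs(last_iq - first_iq)
    if change == 0:
        return "I"    # no variation
    if change <= 6:
        return "II"   # within roughly one probable error (about 5.8 points)
    return "III"      # 7 or more points


# Case counts reported for the three groups.
counts = {"I": 93, "II": 771, "III": 2136}
total = sum(counts.values())
shares = {group: round(100 * n / total) for group, n in counts.items()}
print(total, shares)
```

The counts sum to 3000 and the shares round to 3, 26 and 71 per cent.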


Each group was then analyzed to see: 

a. If IQ’s in the normal or average range tended to remain 
unchanged more or less often than the very high or the very 
low IQ. 

b. The effect of C.A. at first test on the constancy of the IQ. 

c. The influence of length of interval elapsing between tests,
on the constancy of the IQ.

d. How IQ’s of boys and girls compare on first and last test.

These points will be discussed in order and results for each 
group summarized. 

a. Effect of IQ ranges on constancy of IQ. In Group I 
(0 var.) about the same percentage of cases remained constant 
on the last test in the high as in the low IQ ranges. Thus 


IQ Range    No. Cases      %      Total Cases T1
 50- 69        12         3.4%         350
 70- 89        49         3.2%        1510
 90-109        28         2.8%        1017
110-129         4         3.9%         102


In Group II (1-6 pts. var.) the cases range all the way from 
the lowest through the highest IQ groups. Of the 771 cases in 
Group II, 238 cases increased 1-6 points, inclusive, with the 
median increases as follows: 


2-Test Series   121 cases   Med. Increase 2.6 points
3-Test Series    64 cases   Med. Increase 2.7 points
4-Test Series    53 cases   Med. Increase 2.1 points


533 of the 771 cases showed a decrease of 1-6 points, with
median decreases as follows:


2-Test Series   263 cases   Med. Decrease 3.3 points
3-Test Series   156 cases   Med. Decrease (illegible)
4-Test Series   114 cases   Med. Decrease 3.8 points


When there are 100 or more cases at any IQ group, as in the
groups from 70-74 to 95-99, the percentage of distribution is 
approximately the same. This is true also at the very high 
and very low ranges but because numbers are so limited in 
these groups the percentages run large, as at IQ group 35-39 
the percentage is 50 and at 125-129 it is 538, whereas with 
greater frequencies, they range from 10 per cent to 33 per cent. 

In Group III (7 or more pts. var.) 159 cases showed an 
increase of 7 or more points and the median increases were as 
follows: 


2-Test Series    77 cases   Med. Increase 10.7 points
3-Test Series    46 cases   Med. Increase 10.5 points
4-Test Series    36 cases   Med. Increase 10.5 points


1977 cases showed a decrease of 7 or more points with median
decreases as follows: 


2-Test Series   487 cases   Med. Decrease 13.4 points
3-Test Series   708 cases   Med. Decrease 15.1 points
4-Test Series   782 cases   Med. Decrease 17.4 points


The IQ range, high or low, showed little effect when the vari- 
ation is 7 or more points, for again the distribution of cases is 
proportionate to the total distribution at the various ranges. 

b. Effect of C.A. at first test on Constancy of IQ. For 
Group I (0 var.) the frequency distribution of the 93 cases is 
from 1 to 40 cases at chronological ages 4 through 13, except 
for year 12 which had no cases. Only three ages had enough
cases to consider:


Year 5    29 cases or 3.2 per cent
Year 6    40 cases or 2.7 per cent
Year 7    11 cases or 3.4 per cent


Where there is no variation in IQ between the first and last
tests, chronological age at first test shows little effect on the 
constancy of the IQ. 

For Group II (1-6 pts. var.) results showed that cases were 
distributed from Year 3 through Year 13. Where there are 
sufficient cases to be reliable, as at Years 5 to 9, the percentage 
varying 1-6 points at each age remains uniform, so again C.A. 
at first test is not important for IQ constancy. 

For Group III (7 or more pts. var.) the same statement 
holds true. With sufficient numbers to be reliable, the distri- 
bution of cases varying 7 or more points, is similar at each 
chronological age. Thus 71 per cent of the total cases having 
their first test at Year 5 varied 7 or more points, and 69 per 
cent having the first test at Year 8. Evidently a first test made 
on a very young child has no more likelihood of remaining 
unchanged than a first test made on an older one. 

c. Influence of interval elapsing between tests on constancy
of the IQ. In Group I (0 var.) the 93 cases were distributed
fairly evenly over the various intervals, thus 13 per cent of the
cases had an interval of one year between tests and 12 per cent
had an interval of seven years. Each interval had an IQ range 
from subnormal to above average, showing that the length of 
interval elapsing between tests did not prevent IQ’s from 
remaining the same. 

In Group II (1-6 pts. var.) the percentages of the 771 cases 
were about equal for the different intervals and at the various 
C.A.s. Thirteen per cent of the cases had an interval of one 
year between first and last tests, 19 per cent had an interval 
of three years and 10 per cent an interval of six years. 

In Group III (7 or more pts. var.) the distribution and inter- 
vals are given below. 


Interval    No. Cases    % of Total
3 years        203           10
4 years        265           12
5 years        305           14
6 years        390           18
7 years        308           14
8 years        237           11


Here, too, all intervals showed similar percentages of cases at
each age level, so retests made after an interval of eight years 
are no more likely to vary in IQ than those made after a lapse 
of three years. 

d. How IQ’s of boys and girls compare on first and last tests. 
In the 3- and 4-Test Series there were 1207 boys and 793 girls. 
The following table shows the distribution of each in the three 
groups: 











                                     Boys                  Girls
                                No.   Per Cent        No.   Per Cent
                                      of 1207               of 793
Group I (0 var.)                 28       2            14       2
Group II (1-6 pts. var.)
    increased 1-6 pts.           71       6            45       6
    decreased 1-6 pts.          162      13            99      12
Group III (7 or more pts.)
    increased 7 or more pts.     62       5            22       3
    decreased 7 or more pts.    884      74           613      77





Boys and girls show practically no difference in the way their 
IQ’s vary between first and third, and first and fourth tests, 
respectively. 

2. Variations in IQ between successive tests in the 3- and 
4-Test Series. This second phase of the problem of variability 
of the IQ follows the IQ from Test 1 through Test 3 or Test 4, 
according to its respective series, and compares its variations. 
There has been a feeling among psychologists that IQ’s on
second test too often increased unduly and then dropped back
to the level of the first test or below, on the third test. To
ascertain what actually happened, an analysis of the cases
increasing 7 or more points on the second test was made. A
brief summary of the findings is given below.

In the 3-Test Series, 195 cases increased and 772 cases de-
creased on Test 2. Of the 195 cases that increased, 64 or 32.8
per cent increased 7 or more points. Following these 64 cases
through the third test, we find only 4.7 per cent increased 
another 7 or more points while 50 per cent decreased that 
amount on Test 3. 

In the 4-Test Series, 278 cases increased and 673 cases de-
creased on Test 2. Of the 278 cases that increased, 99 or 35.6
per cent increased 7 or more points. Following these 99 cases
through the third test, only 1 case or 1 per cent increased
another 7 or more points while 59 per cent decreased that
amount on Test 3. Again, following the one case that in-
creased 7 or more points on Test 3 on through Test 4, we find
it decreased 4 points on Test 4.

A summary of the data on variability of IQ’s on successive 
tests shows that: 


(a) 3 times as many cases decrease as increase on Test 2
(b) 4 times as many cases decrease as increase on Test 3
(c) 4 times as many cases decrease as increase on Test 4
(d) Those cases that increase 7 or more points on Test 2
    decrease 5 times as often as they increase on Test 3
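
Item (a) can be checked against the Test 2 counts quoted earlier for the 3- and 4-Test Series. Whether the author pooled the two series or averaged them is not stated, so the pooled ratio below is an assumption made for illustration.

```python
# (increased, decreased) on Test 2, as reported for each series.
test2_counts = {"3-Test": (195, 772), "4-Test": (278, 673)}

for series, (increased, decreased) in test2_counts.items():
    # Ratio of decreasing to increasing cases within one series.
    print(series, round(decreased / increased, 1))

pooled_increased = sum(up for up, _ in test2_counts.values())
pooled_decreased = sum(down for _, down in test2_counts.values())
pooled_ratio = pooled_decreased / pooled_increased
print("pooled", round(pooled_ratio, 1))
```

The pooled ratio comes to a little over 3, in line with the "3 times" of item (a), although the two series taken separately differ noticeably.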


C. The last part of this study dealt with the possibility of
C.A. on Test 2 as a factor in determining the direction of the
variability of the IQ on second test. All cases kept their group-
ings of 0 variation, 1-6 points and 7 or more points variation
for each C.A. Results showed that when there are enough
cases having a second test at any given C.A., certain trends are
evident. These are summarized below.

1. From C.A. 7 through C.A. 12, the percentage of cases 
increasing in IQ on Test 2 becomes definitely smaller with each 
additional year of C.A. Thus 9 per cent of the cases having 
the second test at Year 7 increased 7 or more points but at 
Year 11 only 1 per cent increased that much. In other words, 
the older a child is at the time of his second test, the less chance 
there is that his IQ will increase. 

2. Year 6 for a second test is the only age where approxi-
mately the same percentages of cases increased as decreased
and in the same amounts. This may be accounted for by the
fact that a large percentage of Cleveland children attend 
Kindergarten before entering the first grade. A second test 
at Year 6 would indicate that the first test must have been 
made at the beginning of Kindergarten—age 5—or earlier, 
since only cases having tests a year or more apart have been 
used. A Binet would have been given either because the child 
seemed bright enough to enter Kindergarten underage, or 
because his lack of social adjustment or his mental immaturity 
made him a misfit in the Kindergarten. 

During a year of Kindergarten, color naming and recogni- 
tion, counting objects to 20, drawing squares and diamonds, 
describing pictures, etc., are part of the regular curriculum. 
Hence a child having a fair memory could rate normal on a 
Binet at 6 years, but the score would really be a measure of his 
achievement rather than of his intelligence. 

From the data used in this research, it is impossible to find 
out which children attended Kindergarten. However, a 
further check was made of the 75 cases increasing 7 or more 
points on Test 2 at Year 6, to see what happened to them on 
Test 3. The results follow: 


 3 per cent increased 7 or more points on Test 3
56 per cent decreased 7 or more points on Test 3


However, the fact that so many cases decreased on Test 3 
may or may not be due to the unreliability of the second test 
at Year 6, for it must be remembered that cases given to a 
psychologist for a Binet are after all more or less selected 
cases. Many of these children might drop in IQ merely be-
cause their rate of mental development had not kept pace with
their chronological age. Additional data on an unselected 
group would be necessary to make results conclusive on this 
point. 

A child’s intelligence may change in its development just as 
his rate of physical growth varies. Many children seem to 
progress normally during the period of greatest sensory devel-
opment. This period corresponds in school life to the primary
grades where factual material is acquired, and the mechanics 
of reading and the fundamentals of arithmetic learned. Later 
when the higher mental processes such as reasoning, judgment 
and association are needed, and the child is unable to satisfy 
the school requirements, a Binet shows that these processes 
have not yet matured. The following case is just one illustra- 
tion of this. 

Frank T. was one of fifty Kindergarten children tested for 
experimental purposes. At this time his C.A. was 5 years 8
months and he tested 100 IQ or ‘‘High Y.’’ He finished 
Kindergarten and completed his year in first grade success- 
fully. He began to have difficulty in the second grade and at 
the end of 2B had not completed the work. The next term he 
finished 2B and started 2A. A second Binet was given then, 
and his IQ had dropped to 90, which placed him in a ‘‘ Low Y’’ 
group. He was able to achieve the work of this group fairly 
well for a year, and then he failed again. A third Binet showed 
an IQ of 86, four points below the previous test, but still 
within the ‘‘Low Y’’ grouping. Even this work became gradu- 
ally too difficult and two years later Frank was given a fourth 
test. He had dropped another six points, to 79 IQ, and was 
placed in the ungraded or ‘‘Z’’ group. Here he was assigned 
less work and given more time to accomplish it. He had 
learned abstract arithmetic but found its application to con- 
crete problems very difficult. He could memorize history and 
geography facts but their interpretation and relationships were 
beyond him. 
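
Frank's successive placements follow directly from the IQ bands given in the description of the Cleveland schools. The sketch below encodes those bands as a lookup; the function name is hypothetical, and because the text says only that Major Work Classes have IQ’s "over 125", placing an IQ of exactly 125 there is an assumption.

```python
def cleveland_group(iq):
    """Map an IQ to the class grouping described for the Cleveland schools."""
    if iq < 50:
        return "Low Mentals"
    if iq <= 69:
        return "Special Class"
    if iq <= 85:
        return "Z"
    if iq <= 94:
        return "Low Y"
    if iq <= 109:
        return "High Y"
    if iq <= 124:
        return "X"
    # Assumption: 125 itself is treated as Major Work.
    return "Major Work"


# Frank T.'s four Binet results, in order.
frank_iqs = [100, 90, 86, 79]
placements = [cleveland_group(iq) for iq in frank_iqs]
print(placements)
```

The lookup gives High Y, Low Y, Low Y and Z, which matches the sequence of placements described in the case.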

On the other hand, some children develop slowly at first and 
then suddenly spurt ahead. The school work reflects this in- 
creased rate of progress, too, and again a Binet shows a marked 
development in the child’s intelligence. Consider the case of 
Danny R. 

Danny R. was born January 15, 1929. He entered Kinder- 
garten at the age of five years and was found to be such a 
misfit in Kindergarten that after a few weeks he was offered
for a Binet test. The following are records of the four tests
given before the end of the 6A grade, with the date of test, 
mental age and IQ. 


2-2-34     M.A.  4 yrs. 2 mo.   IQ  82
5-9-35     M.A.  6 yrs. 2 mo.   IQ  98
6-8-37     M.A.  9 yrs. 4 mo.   IQ 110
12-3-40    M.A. 15 yrs. 9 mo.   IQ 132


The first Binet showed such mental immaturity that the 
child was excluded from Kindergarten for a year. The next 
year Danny moved into another school district and it was here 
that the writer made his acquaintance. He had re-entered 
Kindergarten but had seemed so ‘‘queer’’ that the teacher 
asked for a Binet. This time he tested normal, so he was placed 
in the first grade in September in spite of his lack of social 
adjustment. The mother was called in and only then was light 
thrown on his peculiarities. The teachers had complained that 
the boy seemed to live in a little world of his own—day dream- 
ing—they called it. He wasn’t interested in group activities 
and was noticeably poor in his motor co-ordination. He had a 
worried look on his face most of the time and watched the 
clock with undue anxiety. 

The mother explained that while Danny was still a baby his 
father had developed encephalitis and, of course, could no 
longer work. In order for the mother to work they moved to 
the grandparents’ home, where Danny could receive care. 
Unfortunately, Danny’s grandfather was a high-strung, ner- 
vous old gentleman who was much annoyed by the child’s 
noise, and expostulated so violently at times that Danny be-
came ‘‘petrified’’ with fear. As a result, he sat on a chair
for hours at a time, scarcely breathing, lest it disturb 
‘‘grandpa.’’ The grandmother’s chief aim was to keep things 
quiet and peaceful at any cost, so Danny paid the penalty. 
It wasn’t until several years had passed that the mother 
realized her boy was not developing normal habits and inter-
ests, and when he was excluded from Kindergarten she decided
she must take him away from the grandparents’ home. They
moved into a different neighborhood and the boy once more 
entered Kindergarten. 

The next few years were a period of educational, social and 
emotional growth for the starved child. It is not surprising 
that the child amazed his teachers with his achievement, for a 
new world had opened up before him. He became an inveter- 
ate reader and could solve arithmetic problems far beyond his 
grade level. Physically frail, he has been under a doctor’s
care much of the time, and because of his fear complex, he was
also treated by a psychiatrist. He made friends with boys in 
spite of lack of physical prowess. Recently, because of his 
excellent school work, he was given a fourth Binet and found 
to have an IQ sufficiently high to warrant placement in a 
Major Work Group, where he is now enjoying competition with 
minds as keen as his own. 

The fact that these variations occur, regardless of whether 
the child appears slow or bright at first, seems to indicate that 
certain factors within the child, whether nutrition, a retarded 
growth at the beginning, glands, an unstable nervous system 
or what not, modify the child’s intelligence. These changes 
are evidenced by the variations of the IQ. 


CONCLUSIONS 


1. The data on 3000 children—of whom 1000 had two tests, 
1000 three tests, and 1000 four tests—show that the IQ range, 
the chronological age at first test, and the interval elapsing 
between first and last tests, may all be eliminated as causes for 
variation in IQ on retest. 

2. Sex does not influence variations in IQ between first and 
last tests, for in the 3- and 4-Test Series results show: 


                              Boys      Girls 
                             (1207)     (793) 
[row label illegible]          2%         2% 
[row label illegible]         11%         8% 
[row label illegible]         87%        90% 

3. A comparison of IQ’s on second, third and fourth tests 
indicates that: 

(a) 3 times as many cases decrease as increase on Test 2 
(b) 4 times as many cases decrease as increase on Test 3 
(c) 4 times as many cases decrease as increase on Test 4 
(d) Those cases that increase 7 or more points on Test 2 
decrease 5 times as often as they increase on Test 3 


4. A study of Chronological Age on Test 2 as a factor in 
determining the direction of the variability of the IQ in Test 
2, shows that the older the child is, the less chance there is that 
his second IQ will go higher. 




















NEWS AND NOTES 


AN OPEN LETTER TO AMERICAN PSYCHOLOGISTS CONCERNING DEVELOP- 
MENT OF MODERN GRAPHOLOGY. 


Dear Sirs: 


Graphology, the science of analyzing a person’s character from his 
handwriting, has often been considered as a pseudo-science in this country. 
This is not surprising, as any person may call himself—or frequently 
herself—a graphologist because no qualifications are required for this 
profession. Therefore, so-called graphologists are offering character 
analyses from two lines of handwriting or even from a signature only 
on boardwalks or in department stores for from 10¢ to 25¢, while a 
serious analysis requires many hours of study of manuscripts, preferably 
from different periods of the writer’s life. 

In European countries, graphology is considered as a daughter of psy- 
chology without whose knowledge graphologists cannot work. “La 
Société de Graphologie,” founded in Paris in 1871, has been 
seriously working on the development of this science. This society pub- 
lished a bi-monthly periodical, “La Graphologie Scientifique,” with 
articles by internationally known graphologists, but it was discontinued 
when the Germans occupied Paris. For several years before the present 
war the International Congress of Graphologists used to meet in Paris 
and to discuss the development of graphology every year. 

An excellent periodical on graphology, “Die Schrift,” later called 
“Graphologia,” was published by the Czecho-Slovakian scientists Dr. 
Otto Fanta and Willi Schoenfeld in Prague, but it ceased to appear after 
the Germans took over Czecho-Slovakia. 

The English “Autograph and Graphological Society” requires rigid 
oral and written examinations before accepting graphologists as members. 
One of the leading graphologists is Professor Max Pulver, of Zurich, who 
in his theory employs psycho-analysis as well as Prof. C. G. Jung’s 
findings. In his book “Die Symbolik der Handschrift” (Symbolism of 
Handwriting), Pulver divides the writing space into three zones (upper, 
middle and lower) from which he draws his conclusions with respect to 
Form and Content of the Conscious, while by observing the writing move- 
ment from the left to the right he finds the expression of Introversion and 
Extroversion. 

The most meticulous study of movements of handwriting and their 
psychological value is found in Robert Saudek’s works, “The Psychology 
of Handwriting” and “Experiments with Handwriting,” where abun- 
dant proof is given for each statement made by Saudek. He considers 
the speed of writing as the basis for any analysis and frequently refers to 
the research work of Prof. Frank N. Freeman, of Chicago, one of the 
foremost experts in this field. 

The French graphologist, J. Crépieux-Jamin, shows in his book “Les 
Eléments de l’Ecriture des Canailles” the expression of insincerity in 
handwriting by hundreds of samples. 

The usefulness of graphology in many fields has been discovered in this 
country too. Medical circles have taken interest in the research work of 
modern graphologists. In 1939, the “Medical Record” published an essay 
by Dr. Eric Alten, of New York, on “The Psychology of Handwriting 
and Its Importance to the Physician.” The American Journal of Psy- 
chiatry published last year part of the research work of the well known 
New York graphologist, Thea Stein Lewinson, on “Dynamic Disturbances 
in the Handwriting of Psychotics.” 

The Pennsylvania Institute of Criminology in Philadelphia gave me an 
opportunity to lecture on the use of graphology in crime detection and 
crime prevention and to answer questions on this subject over the radio. 
Hans Jacoby, now in London, established his reputation by his book 
“Handschrift und Sexualitaet” (Handwriting and Sexuality). Paul 
Koch’s “Kinderschrift und Charakter” (Children’s Handwriting and 
Their Character) is valuable for any psychologist who is interested in 
problem children. 

Graphology has proven its value as a vocational guide. Historical 
research work has become interested in that science, as it is often the only 
means of getting an impartial character picture of a deceased person. 
In short, graphology is of greatest importance in cases where deeper 
insight into human nature and personality is required. 

RUDOLPH S. HEARNS. 


The Psychological Bulletin for June, 1941, is a special number devoted 
entirely to articles dealing with Military Science, with Dr. Carroll C. Pratt 
of Rutgers University as Editor. It has been prepared at the request of 
the Emergency Committee in Psychology appointed by the Division of 
Anthropology and Psychology of the National Research Council. The 
price of this issue is $1.00 and it can be purchased from the American 
Psychological Association, Inc., Northwestern University, Evanston, 
Illinois. 

Sub-topics dealt with are as follows: Army motor transport personnel: 
Harry R. DeSilva, Phillip Robinson and Willis H. Frisbee, Jr.; Aviation: 
G. H. S. Razran and H. C. Brown; Classification of military personnel: 
Thomas W. Harrell and Ruth D. Churchill; Effects of certain drugs on 
mental and motor efficiency: S. D. S. Spragg; German military psy- 
chology: H. L. Ansbacher; Morale: a bibliographical review: Irvin L. 
Child; Motivation and learning in relation to the national emergency: 
O. H. Mowrer; Perception: Samuel W. Fernberger, et al.; Propaganda 
technique and public opinion: Bruce Lannes Smith; Psychological causes 
of war: Ross Stagner; Rehabilitation: Morris S. Viteles; War neuroses: 
William H. Dunn, M.D.; The Gasiorowski bibliography of military psy- 
chology: H. L. Ansbacher. 


An interesting new test is the California Capacity Questionnaire, de- 
vised, according to the authors, Elizabeth T. Sullivan, Willis W. Clark and 
Ernest W. Tiegs, “in response to a demand for a short, easily admin- 
istered, dependable measure of capacity, intelligence, or mental alert- 
ness.” The test may be taken in thirty minutes and is self-administering. 
It is constructed on the spiral plan, the problem situations drawing on 
various mental factors for solution recurring on levels of increasing diffi- 
culty. In spite of its brevity it yields a reliable M.A. and IQ for both 
language and non-language capacity. These measures are valuable for 
guidance, selection, and placement in business and industry as well as in 
schools. The test is published by the California Test Bureau, 3636 Beverly 
Boulevard, Los Angeles, California. 


The Committee for National Morale announces the publication of a book 
entitled “German Psychological Warfare,” a critical, annotated and 
comprehensive Survey and Bibliography prepared under the direction of 
Ladislas Farago with the cooperation of Professors Gordon W. Allport, 
E. G. Boring, Dr. S. S. Stevens and Dr. Beebe-Center of Harvard Univer- 
sity, Professor Kimball Young of Queens College, and Dr. Floyd Ruch 
of the University of Southern California. The work of 112 pages covers 
the period from 1846 to 1941, emphasizes the formidable Nazi program of 
the last decade, and contains an analytic foreword of 60 pages, arranged 
in question-and-answer form, outlining the origins and development of 
German military psychology, its impact abroad and its uses in the present 
war. This book may be obtained by writing to the Committee for Na- 
tional Morale, 51 E. 42nd Street, New York City. The subscription price 
is $2.50. 


The Minnesota Chapter of Psi Chi, 1940-1941, under the presidency of 
Dale B. Harris, has had published the last paper prepared by Dr. Fred- 
erick Kuhlmann, which he read before the Minnesota Chapter of Psi Chi 
on the occasion of his introduction as Honorary Member of that Chapter, 
March 20, 1941, about a month before his death. This contribution, en- 
titled “Our Changing Fashions in Methods of Research,” comes as a very 
fitting close to a long and distinguished scientific career. “Those who 
knew Dr. Kuhlmann personally preserve a memory of a quiet, reserved 
individual who had much consideration and encouragement for his col- 
leagues. Perhaps one of the finest tributes that can be given any person 
is the testimony that he evoked, without exception, strong loyalty from 
his associates; this statement is preeminently true of Dr. Kuhlmann. Ex- 
emplifying so clearly in a life-time of work in Minnesota the scholastic 
and scientific ideals which Psi Chi professes, he was particularly well 
suited to its membership.” 


The attention of the readers of this Journal is called to a new person- 
ality test, “The Personal Audit,” constructed by Drs. Clifford R. Adams 
and William M. Lepley, of Pennsylvania State College. This instrument 
is made up of nine parts, with four hundred fifty items, and has been 
designed for persons with a high school education or its equivalent. It is 
self-administering and may be used with any group of manageable size. 
The scoring is extremely simple, no keys required, and all nine parts can 
be scored in from four to five minutes. The nine sub-tests are numbered 
and tentatively described as follows: 

I. Sociability, extroversion; II. Suggestibility, a tendency to agree with 
authority; III. Susceptibility to annoyance, a tendency to irritability; 
IV. A tendency to rationalize, a tendency to make alibis and excuses; 
V. A tendency to anxiety, a tendency to excess emotionality; VI. A ten- 
dency to excessive sexual emotionality and conflicts; VII. A tendency to 
personal intolerance; VIII. Flexibility or docility of attitudes; IX. A 
tendency to think, possibly worry, about unsolved problems. Further 
information may be obtained by addressing the authors at Pennsylvania 
State College, State College, Pa. 


Sidney Hillman, Associate Director General of the Office of Production 
Management, has recently announced that defense training courses will 
be given priority in the nation’s vocational schools. The underlying pur- 
pose of this new plan is to foster a closer relationship between defense 
training and the known need for workers in defense industry, city by city, 
and state by state. Using the facilities of the 1,500 State and local 
employment offices of the United States Employment Service, which have 
already compiled registers as to the availability of labor, and demand for 
it, workers in a particular area will be referred from local employment 
office lists to training in the schools and for later essential defense jobs. 


More than 160 of the nation’s leading scientists and scholars, including 
thirty-two distinguished men and women who will be awarded honorary 
degrees, will report basic achievements and advances in learning in a 
five-day series of symposia sponsored by the University of Chicago, begin- 
ning September 22, 1941. These gatherings will be held on the Midway 
Quadrangles in the week climaxing the celebration of the University’s 
Fiftieth Anniversary. Thirty-nine universities, including six in foreign 
nations, and fifteen museums, research organizations, and government 
agencies will be represented in the symposia. They will deal with newest 
fundamental advances in the biological, physical, and social sciences, the 
humanities, law, business, religion, and social service. Symposia subjects 
include aviation medicine, cancer of the lungs, the problem of over-abun- 
dant evidence in historical research, cosmic rays, oil geology, social impli- 
cations of vitamins, and many more. 


The Institute for Propaganda Analysis, 211 Fourth Avenue, New York 
City, has recently announced the election of new officers and a new admin- 
istrative secretary. Professor Kirtley Mather of Harvard University is 
now President, Professor F. Ernest Johnson of Columbia University, Vice- 
President; Professor Clyde R. Miller of Columbia University, Secretary; 
Professor Alfred M. Lee of New York University, Treasurer. The Insti- 
tute, a non-profit, educational organization, was founded in 1937 to fur- 
nish material to individuals, schools and groups studying propaganda and 
public opinion. A monthly report, Propaganda Analysis, carries the 
Institute’s current findings on propaganda to the members. 


The American Youth Commission of the American Council on Educa- 
tion has recently published a descriptive directory, Youth Serving Organi- 
zations, which contains a comprehensive survey of the structure, aims and 
activities of 320 organizations operating at a national level, composed 
either of youth or of adults whose programs serve the needs of youth. It 
is edited by Dr. M. M. Chambers, a staff member of the American Youth 
Commission, and a co-author of several other staff reports including How 
to Make a Community Youth Survey. 


The teaching of psychology in the 625 junior colleges in the country 
will receive intensive study this year as programs, curricula, staff, and 
other factors related to the more effective teaching of this science in 
junior colleges are investigated by a national committee appointed by 
President J. C. Miller of the American Association of Junior Colleges. 
The committee, with Dr. Louise Omwake of Centenary Junior College, 
Hackettstown, N. J., as chairman, will be known as the “Committee on 
Psychology in Junior Colleges.” It is an outgrowth of a special con- 
ference called by the Committee on Instruction of the American Associa- 
tion for Applied Psychology at Atlantic City in February to discuss 
instruction in psychology on the high school and junior college levels. 


WILLARD L. VALENTINE. Experimental Foundations of General Psychol- 
ogy. (Revised Edition.) New York, Farrar and Rinehart, Inc., 
Publishers. 1941. Pp. xvi+432. $2.00. 

This book, as the title indicates, is a revision of a book of the same title 
and purpose published in 1938. The first “book was written to help sup- 
ply the need in a first course in psychology for reviews of contemporary 
experimental [literature] presented in such a way that the beginner can under- 
stand them . . . the contribution is neither exhaustive nor critical . . . 
controversial issues have been avoided. It is concerned mainly with the 
application of the scientific method to behavior problems. . . .” Enough 
literature in each of the traditional fields is reviewed and digested “to 
show why the generalizations that are made in the text appear reasonable 
in the light of our present knowledge.” These generalizations are made 
quite obvious by being set in italics. 

In the preface to the second edition the author states that, as the result 
of many suggestions, he has added more interpretation and systematiza- 
tion to the second edition to make it more integrated and its exposition 
clearer to the immature student. He has also eliminated as many techni- 
calities as possible and as a result has a book which any college freshman 
should be able to read, and understand, if not enjoy. 

There are 20 chapters plus a summary, each chapter treating some 
important and interesting topic in a specific, concise manner and ending 
in a summary. The material is amply illustrated with pictures, tables, 
diagrams, and graphs. The appearance is attractive, and freshmen will 
find it more interesting than the average beginning text-book in psychol- 
ogy. In fact, it seems to me that it would make an excellent text for a 
one semester course. If it were so used, there might be fewer freshmen 
failing psychology, and there would be more students taking a second 
course. 

T. C. SCOTT, 
Ohio University 


BRITT, STEWART HENDERSON. Social Psychology of Modern Life. New 
York, Farrar and Rinehart, Inc. 1941. Pp. xviii+562. 

When a new text-book appears in the field of social psychology one 
wonders why. The state of confusion in the field as it exists at present 
has been well pointed out, among others, by Smoke, and more recently 
by Kantor. A new text might be characterized by the selection of mate- 
rials, or it might be characterized by the author’s belief that one ap- 
proach, say historical, or experimental, is more fruitful than another. It 
might be written by a man with a message, setting forth the point of 
view of a particular school or system, or in the interest of some sort of 
reform. The writer of Social Psychology of Modern Life does not seem 
to have been influenced by any of these considerations. The material 
presented is in no way eccentric. It is voluminous, so much so in fact 
that, despite the nearly 600 pages, many topics are little more than men- 
tioned in passing. It seems to this reviewer, however, that the writer 
has apportioned space to various subjects in reasonable conformity with 
their relative importance. For example, in the subject index, under 
“defense mechanism” one page reference is given, under “dating” four 
page references, under “culture” about seventy, some of which involve 
several pages, besides many more under cultural lag, culture patterns, etc. 

The method of approach, while largely empirical, can hardly be said 
to be such as to characterize the book on the basis of method. The mate- 
rials have been drawn from all sorts of sources and all sorts of methods: 
experiments, test results, field studies, even history, journalism, and the 
arm chair. This is not to say that the 
writer is unaware of the importance of method. He clearly distinguishes 
between fundamental experimental material and excerpts from news 
stories or speeches which he uses by way of illustration. It is only meant 
to say that the methods like the material are eclectic and hence do not 
characterize the book. 

Nor can we regard the writer as a man with a message, having some 
particular ax to grind. The book neither demonstrates how the teaching 
of a special school or system works out in social psychology, nor is it a 
thesis for any reform or set of moral values. The writer does not obscure 
his adherence to a liberal view, the democratic ideal, and certain other 
values not without moral significance, but certainly there is nothing of 
the tract about the book. Since it does not depart from the usual or 
conventional in subject matter or method of approach, and is written for 
no discernible ulterior purpose, it might seem to the casual reader to be 
just another text-book. And perhaps it is no more than that. Yet it is 
almost certain to appeal to many teachers as a text-book with a differ- 
ence. The outstanding characteristics are three. It either discusses or 
touches upon a wide range of material, it is written in an exceptionally 
interesting style and it contains a great deal in the way of examples, 
illustrations of facts and principles, and applications drawn from ordi- 
nary life situations. It impresses one as being very much the kind of 
book that students will like. With it as a text the course in social psy- 
chology is pretty sure to be interesting. While not unique in any of the 
above respects, the combination of a wide variety of material with fortu- 
nate pedagogical devices and an interesting style justifies the book. While 
some will object to its emphasis on fact and illustration, and will regard 
it as insufficiently critical, it will hardly be regarded, certainly not justly 
so, as a merely popular text. If it aims to be popular, it also aims to 
be informative. The student emerging from a course in which it is used 
as a text will know a lot about social psychology, and will have an in- 
creased insight into many phases of contemporary life. Whether his 
knowledge is very well systematized, and whether it is integrated with 
the rest of psychology is another question, but one for which Britt’s book 
will not be particularly more responsible than would many another book 
in the field. 

There are several full page illustrations, a fairly extensive bibliography, 
and the usual author and subject indexes. 

AMOS C. ANDERSON, 
Ohio University 


HULL, CLARK L., HOVLAND, CARL I., ROSS, ROBERT T., HALL, MARSHALL, 
PERKINS, DONALD T., and FITCH, FREDERIC B. Mathematico-Deduc- 
tive Theory of Rote Learning. A Study in Scientific Methodology. 
Pp. xii+329. $3.50. 

This work is one of the first systematic attempts to apply to biological 
science the mathematico-deductive methodology which has been so fruitful 
in the physical sciences. The only other substantial effort in this direc- 
tion was made by J. H. Woodger in his Axiomatic Method in Biology 
(1937). 

The problem is to construct, from a collection of undefined terms and 
a set of postulates, a theory of rote learning which is not only internally 
consistent in the logical sense but externally consistent with the empiri- 
cally observed world. It can scarcely be said that the problem has been 
solved (nor do the authors make any such claim), but there is ground for 
asserting that the attempt is in some respects more interesting than the 
solution anyway. The problem is attacked by logical and mathematical 
methods having nothing in common with such other recent efforts to 
mathematicize psychology as Lewin’s “topological” approach. 

Professor Hull and his collaborators have wisely confined their attempt 
at psychological systematization to the theory of rote learning, a restric- 
tion which has at least two advantages: (1) abundant quantitative experi- 
mental data are at hand, from which the assumptions of the system may 
be constructed and by which its conclusions may be checked; (2) the con- 
cepts involved in the study of rote learning, while complex by comparison 
with those of physics, are considerably simpler than those involved in 
most other branches of psychology. 

After an introductory chapter on scientific methodology, and a discus- 
sion of the experimental procedure on which the theory is based, the 
exposition of the formal system begins with a list of the undefined con- 
cepts. Of these there are sixteen, some being observables, such as subject 
and syllable exposure, and others unobservables, such as stimulus trace 
and inhibitory potential. Extreme operationalists may object to the 
presence of unobservables among the undefined concepts, but their inclu- 
sion will not distress those who feel no pain when a physicist speaks of 
entropy or force. Each undefined term is listed twice: first in English, 
with a brief informal “explanation” of its “meaning” (not a defini- 
tion, of course, in the formal sense); secondly, in the language of sym- 
bolic logic. Dually presented in the same way is a list of eighty-six defi- 
nitions; one further defined term, inadvertently omitted from this list, 
will be found in a footnote on p. 128. In an appendix are an additional 
thirty-four definitions of concern only to the reader interested in the 
symbolic-logical development of the system. 

Next follow the eighteen postulates of the system; seven more, of little 
psychological interest but necessary for the rigorous derivation of theo- 
rems, are listed in an appendix. Each postulate is stated first in English, 
then in logical symbolism, and is followed by a discussion of the experi- 
mental results which led the authors to frame it. Interspersed among 
the postulates are a number of corollaries, and following the postulates 
is the real body of the theory, the collection of theorems and problems. 
The statement of each theorem and corollary (except for the last seven 
theorems) is followed by its mathematical proof, and this by a relatively 
non-mathematical discussion of the proposition. While the proofs make 
no severe demands on the reader’s mathematical training—an acquain- 
tance with the calculus will suffice—these informal sections will probably 
be the most interesting to the average psychologist. In some cases a third 
section is appended, discussing the experimental evidence bearing on the 
empirical truth of the proposition. 

A word should be said about the place of symbolic logic in the mono- 
graph. The logical symbolism used is, like Woodger’s, a slight modifica- 
tion of that employed by Whitehead and Russell in their Principia 
Mathematica. As noted above, the terms of the system (defined and un- 
defined), and the postulates, are set forth in this symbolism; but, unfor- 
tunately for the reader interested in the applications of symbolic logic, 
the corollaries and theorems are not. The practical obstacle to publish- 
ing the symbolic proofs will be apparent from the single case in which 
this has been done as an example: the informal proof of the corollary to 
Postulate 1 occupies a single line; the symbolic proof, given in an ap- 
pendix, requires three pages, and even so is not given in full detail. The 
difference, of course, lies in the fact that what is intuitively obvious on 
the informal linguistic level may be extremely difficult to prove rigorously. 
Undoubtedly symbolic logic is the instrument par excellence for precise 
reasoning, but the question may be raised whether it contributes much 
of importance in the early stages of the process of systematizing such a 
science as psychology, in which even the broad outlines of the theoretical 
structure are still rather obscure. 

In the opinion of the reviewer, the chief interest of the work lies in 
its methodological aspect, rather than in its contribution to our knowledge 
of rote learning. This is not to say that the psychologist will not find 
suggestive the conclusions drawn from such assumptions as Postulate 2, 
which postulates a connection between rote learning and trace conditioned 
reactions, but rather to emphasize the fact that this is a pioneering effort, 
by a group of scholars in diverse fields, to apply the techniques of one 
branch of science to the unification and systematization of another. To 
mathematicians and logicians, as well as to psychologists, the book should 
be interesting as an example of the ever-widening application of abstract 
deductive methods free of the metaphysical incubus. 

D. D. MILLER, 
Ohio University 


PSYCHE CATTELL. The Measurement of Intelligence of Infants and 
Young Children. New York, Psychological Corporation. 1940. 
Pp. 274. 

It is only to be expected that the apparent success of test batteries for 
school children should be followed by parallel tests designed for infants. 
There is always the desire, and in certain situations the need, to predict 
mental level from a base determined as early in life as possible. How- 
ever, the several attempts to devise testing instruments for very young 
children have not met with the success of batteries for older children. 
This has been because of the nature of test items available, because of 
the more serious effects of fatigue, illness, or distraction on the younger 
child’s performance, and because of the large part that purely physical 
maturation plays in the infant’s development. 

In the present monograph Dr. Cattell describes a scaled test for ages 
2 months to 30 months. There are five items (with one or two alter- 
nates) at each month from 2 to 12, at each second month from 14 to 24, 
and at 27 and 30 months. The items at the lower ages are adapted from 
Gesell, and at the higher blend with the Terman-Merrill Form L, of which 
it can be considered a downward extension. 

“The standardization of the tests is based upon 1346 examinations 
made on 274 children at the ages of three, six, nine, twelve, eighteen, 
twenty-four, thirty and thirty-six months.” This statement is somewhat 
misleading as it implies the standards are based on eight examinations 
of each of 274 children. Actually each child had on the average only 5 
examinations, and from the detailed table showing item successes it is 
evident that no item was given to all of the 274 children. In fact the 
size of the groups used for each item at the various ages ranged from 19 
to 206, with a very small proportion having fewer than 50. As has been 
found with previous infant scales, I.Q. values exhibit erratic behavior in 
individual cases on successive tests, although there is a strong tendency 
for the higher children to remain high, and the lower to remain low. 
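
The arithmetic behind the objection to the quoted standardization statement can be checked directly. The sketch below is only an illustration: Python and the variable names are assumptions of this note, not anything taken from the monograph. It compares the total that eight examinations per child would require with the average actually obtained from 1346 examinations on 274 children.

```python
# Figures quoted in the review of Cattell's scale.
total_examinations = 1346
children = 274
scheduled_ages_months = [3, 6, 9, 12, 18, 24, 30, 36]

# Total required if every child had been examined at every scheduled age.
implied_total = children * len(scheduled_ages_months)

# Average number of examinations each child actually received.
mean_per_child = total_examinations / children

print(implied_total)             # 2192
print(round(mean_per_child, 2))  # 4.91
```

The average of about 4.9 agrees with the reviewer's figure of roughly 5 examinations per child, well short of the 2192 examinations that eight per child would have required.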

Dr. Cattell gives valuable suggestions for examining young children and 
warns against using the scale as an instrument of precision. The quan- 
titative characteristics of the scale make it valuable for research purposes. 
However, its relatively low validity (r=.10 between 3 and 36 months, 
and .71 between 2 years and 3 years) makes individual prediction hazard- 
ous unless experienced clinical judgment supports the test score. This 
new scale appears to be a useful addition to present instruments. Its 
value can only be proved by use. 


C. M. LOUTTIT, 
Indiana University 